[2025 Autumn][T1-1-7] GreenHandHand #776

Open

GreenHandHand wants to merge 2 commits into InfiniTensor:main from GreenHandHand:2025-autumn-GreenHandHand-T1-1-7

GreenHandHand commented Dec 13, 2025

Description

Infinicore competition task T1-1-7: CPU implementations of the logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, and max operators. The GPU part is implemented with ninetoothed and lives in the ntops PR.

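Technical details worth noting

Technical details

In torch, the lp_poolnd operators use an approximate implementation, namely $\sqrt[p]{\text{mean}(W_i^p) \times \text{KernelSize}}$, so their behavior is inconsistent when ceil_mode=True. This PR adopts an implementation whose results match PyTorch. See the lp_poolnd kernel code for details.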
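The effect of the mean-based formula can be sketched in pure Python. This is a minimal illustration, not the kernel from this PR: the function names, the `torch_style` flag, and the restriction to 1D, zero padding, and non-negative inputs are all assumptions made for the example. It also assumes that, for a window truncated by ceil_mode, the mean is taken over the in-bounds elements only while the rescaling still uses the nominal kernel size.

```python
import math


def pool_output_size(length, kernel_size, stride, ceil_mode):
    """Output length for zero padding, using the usual pooling shape rule."""
    span = (length - kernel_size) / stride
    out = (math.ceil(span) if ceil_mode else math.floor(span)) + 1
    # In ceil mode the last window must still start inside the input.
    if ceil_mode and (out - 1) * stride >= length:
        out -= 1
    return out


def lp_pool1d(x, p, kernel_size, stride=None, ceil_mode=False, torch_style=True):
    """Hypothetical 1D Lp pooling over a list of non-negative floats."""
    stride = stride or kernel_size
    n_out = pool_output_size(len(x), kernel_size, stride, ceil_mode)
    result = []
    for i in range(n_out):
        # Slicing clips the window at the end of the input.
        window = x[i * stride : i * stride + kernel_size]
        power_sum = sum(v ** p for v in window)
        if torch_style:
            # Assumed behavior: mean over in-bounds elements,
            # then rescale by the nominal kernel size.
            power_sum = power_sum / len(window) * kernel_size
        result.append(power_sum ** (1.0 / p))
    return result


x = [1.0, 2.0, 3.0, 4.0, 5.0]
exact = lp_pool1d(x, p=2, kernel_size=2, ceil_mode=True, torch_style=False)
approx = lp_pool1d(x, p=2, kernel_size=2, ceil_mode=True, torch_style=True)
```

Under these assumptions the two variants agree on full windows and differ only on the truncated last window `[5.0]`, where the mean-based form scales the power sum by `kernel_size / len(window)`.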

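Some issues

The last logsumexp test case is omitted: its out tensor has overlapping data (the stride is too small), so the torch CPU output is not unique.
The Moore Threads max operator has a problem with long not supporting NaN.
The GPU implementation of the Moore Threads max operator cannot handle non-contiguous input.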
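The overlap problem in the first item can be illustrated with a small check. This is a sketch only: the helper names and the example shape and strides are hypothetical and are not taken from the omitted test case. When two different indices of a strided view map to the same storage offset, the value left in that location depends on the order of writes, which is why the reference output is not unique.

```python
from itertools import product


def element_offsets(shape, strides):
    """Storage offset (in elements) of every index of a strided view."""
    return [
        sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(n) for n in shape))
    ]


def has_internal_overlap(shape, strides):
    """True if at least two indices share the same storage offset."""
    offsets = element_offsets(shape, strides)
    return len(set(offsets)) != len(offsets)


# Contiguous 2x3 layout: every element has its own slot.
contiguous = has_internal_overlap((2, 3), (3, 1))
# Row stride smaller than the row length: rows share a slot.
overlapping = has_internal_overlap((2, 3), (2, 1))
```

In the second layout, indices (0, 2) and (1, 0) both land on offset 2, so an operator writing to such an out tensor has no single well-defined result.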

Run screenshots

cpu

Screenshots: summary, logsumexp, lp_pool1d, lp_pool2d, max

nvidia

Screenshots: summary, logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, max

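Moore Threads

The max operator built into the Moore Threads platform has problems with non-contiguous tensors and does not support global max, so some tests are skipped here.

Screenshots: summary, logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, max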

MetaX

Screenshots: summary, logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, max

Iluvatar

Screenshots: summary, logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, max

HONOR_CODE

REFERENCE

Official PyTorch documentation and official Triton documentation.


Infinicore competition: implement logsumexp, lp_pool1d, lp_pool2d, lp_pool3d, max

c879b27

GreenHandHand mentioned this pull request

[2025 Autumn][T1-1-7] GreenHandHand InfiniTensor/ntops#59

Open


Fix type naming

3dfbc8b
