add max 2d subarray sum #81
base: main
Pull request overview
This PR adds a new medium-difficulty coding challenge called "Max 2D Subarray Sum" that requires computing the maximum sum of any contiguous 2D subarray of a fixed window size. The challenge includes starter templates for multiple GPU programming frameworks and comprehensive test cases.
- Implements reference solution using 2D prefix sum approach with PyTorch
- Provides starter templates for Triton, PyTorch, Mojo, and CUDA implementations
- Includes example, functional, and performance test cases with various edge cases
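For orientation, the 2D prefix-sum technique mentioned above can be sketched in plain Python. This is a minimal illustration of the approach, not the PR's PyTorch reference solution; the function name `max_window_sum` is hypothetical.

```python
# Illustrative sketch of the 2D prefix-sum approach: build a padded
# (n+1) x (n+1) prefix table, then evaluate each k x k window by
# inclusion-exclusion in O(1) per window.

def max_window_sum(grid, window_size):
    """Max sum over all window_size x window_size subarrays of a square grid."""
    n = len(grid)
    k = window_size
    # psum[i][j] = sum of grid[:i][:j]; the extra row/column of zeros
    # avoids bounds checks at the top and left edges.
    psum = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            psum[i + 1][j + 1] = (grid[i][j] + psum[i][j + 1]
                                  + psum[i + 1][j] - psum[i][j])
    best = None
    for i in range(n - k + 1):
        for j in range(n - k + 1):
            # window sum via inclusion-exclusion over the padded table
            s = (psum[i + k][j + k] - psum[i][j + k]
                 - psum[i + k][j] + psum[i][j])
            best = s if best is None else max(best, s)
    return best

print(max_window_sum([[1, 2], [3, 4]], 1))  # 4
print(max_window_sum([[1, 2], [3, 4]], 2))  # 10
```

The padded table is the same trick the reference solution's `padded` tensor implements, which is why the review comments below focus on how that tensor is constructed.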
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| challenges/medium/55_max_2d_subarray_sum/challenge.py | Challenge implementation with reference solution, test generation, and function signatures |
| challenges/medium/55_max_2d_subarray_sum/challenge.html | HTML documentation describing the problem, examples, and constraints |
| challenges/medium/55_max_2d_subarray_sum/starter/starter.triton.py | Triton starter template with function signature |
| challenges/medium/55_max_2d_subarray_sum/starter/starter.pytorch.py | PyTorch starter template with function signature |
| challenges/medium/55_max_2d_subarray_sum/starter/starter.mojo | Mojo starter template with function signature |
| challenges/medium/55_max_2d_subarray_sum/starter/starter.cu | CUDA starter template with function signature |
```python
assert output.dtype == torch.int32

psum = input.cumsum(dim=0).cumsum(dim=1)
padded = torch.zeros((N+1, N+1), dtype=torch.int32)
```

Copilot AI · Jan 3, 2026

The `padded` tensor is created on the CPU by default by `torch.zeros`, but the `input` tensor is on the GPU (a `cuda` device). This causes a runtime error when `psum` is assigned into `padded[1:, 1:]`, since both tensors must be on the same device. `padded` should be created on the same device as the input tensor:

```diff
- padded = torch.zeros((N+1, N+1), dtype=torch.int32)
+ padded = torch.zeros((N+1, N+1), dtype=torch.int32, device=input.device)
```
```python
    "window_size": 7
})

# increasing_sequence
```

Copilot AI · Jan 3, 2026

The comment says "increasing_sequence", but the test generates random integers with `torch.randint(-10, 11, ...)`. The comment is misleading and should describe what the test actually does, such as "mixed_positive_negative" or "random_values":

```diff
- # increasing_sequence
+ # mixed_positive_negative
```
```mojo
from math import ceildiv

# input, output are device pointers (i.e. pointers to memory on the GPU)
@export
```

Copilot AI · Jan 3, 2026

There is trailing whitespace after the `@export` decorator; it should be removed for code cleanliness.
```cuda
// input, output are device pointers (i.e. pointers to memory on the GPU)
extern "C" void solve(const int* input, int* output, int N, int window_size) {
```

Copilot AI · Jan 3, 2026

There is trailing whitespace on this line; it should be removed for code cleanliness.
```diff
@@ -0,0 +1,7 @@
+import torch
+import triton
```

Copilot AI · Jan 3, 2026

The `triton` import is unused and should be removed.
```diff
@@ -0,0 +1,7 @@
+import torch
+import triton
+import triton.language as tl
```

Copilot AI · Jan 3, 2026

The `tl` import (`triton.language`) is unused and should be removed.