@tonnitommi

Description

Snowflake Doc AI has a new "one-shot parse" function, AI_PARSE_DOCUMENT. This action package exposes it as an action, along with a helper action that lists the files in a stage. For now it supports only digital documents (PDF, PPTX, and DOCX); support for documents that need OCR can be added later.

Per the whitelist file, this is published only to Studios connected to Team Edition.

How was this tested?

Locally in Cursor and Studio with ~10 different docs.

Checklist:

  • I have bumped the version number for the Action Package / Agent
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation - README.md file
  • I have updated the CHANGELOG.md file in correspondence with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • OAuth2: I have made sure that action has necessary scopes (works in whitelisted mode)

@tonnitommi tonnitommi requested a review from kariharju October 16, 2025 15:06
Comment on lines +99 to +106
# Validate supported file types (Snowflake AI_PARSE_DOCUMENT supports .pdf, .docx, .pptx)
supported_extensions = {'.pdf', '.docx', '.pptx'}
if file_ext_lower not in supported_extensions:
    unsupported_msg = (
        f"Unsupported file type '{file_ext}'. Only PDF, DOCX, and PPTX are supported "
        f"by AI_PARSE_DOCUMENT for this action. See docs: https://docs.snowflake.com/en/user-guide/snowflake-cortex/parse-document"
    )
    return Response(error=unsupported_msg)

This is not quite bulletproof.
Checks like this are typically done on the MIME type rather than the file extension.
But I guess it's good enough for now.
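
For reference, a content-based check along the lines suggested above could look roughly like the sketch below. It is not code from this PR: the helper name `sniff_document_type` is hypothetical, and the detection is a heuristic that assumes a PDF starts with the `%PDF-` marker and that DOCX/PPTX files are ZIP containers holding the conventional `word/document.xml` or `ppt/presentation.xml` part.

```python
import io
import zipfile
from typing import Optional


def sniff_document_type(data: bytes) -> Optional[str]:
    """Guess '.pdf', '.docx', or '.pptx' from file content; None if unknown.

    Hypothetical helper, not part of the action package. Heuristic only.
    """
    # Assumption: the PDF marker sits at the very start of the file.
    if data.startswith(b"%PDF-"):
        return ".pdf"

    # DOCX and PPTX are ZIP containers; the main part name tells them apart.
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return None
    try:
        with zipfile.ZipFile(buffer) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        # Looked like a ZIP but the archive is damaged.
        return None

    if "word/document.xml" in names:
        return ".docx"
    if "ppt/presentation.xml" in names:
        return ".pptx"
    return None
```

The result could then be compared against `supported_extensions` in place of, or in addition to, `file_ext_lower`, so that a renamed file does not pass the check.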

  # Execute PUT command with the correctly named file
  put_command = f"PUT 'file://{correct_name_path}' '{stage_location}' OVERWRITE=TRUE AUTO_COMPRESS=FALSE SOURCE_COMPRESSION=NONE"
- print(f"[{datetime.datetime.now().strftime('%H:%M:%S.%f')}] Executing PUT command to upload file...")
+ print("Uploading file to stage...")

Reason for removing timestamp?
