Alibaba Cloud MaxCompute (previously known as ODPS) is a general-purpose, fully managed, multi-tenant data processing platform for large-scale data warehousing. MaxCompute supports various data importing solutions and distributed computing models, enabling users to effectively query massive datasets, reduce production costs, and ensure data security.
The MaxComputeLoader lets you execute a MaxCompute SQL query and loads the results as one document per row.
Basic Usage
To instantiate the loader you’ll need a SQL query to execute, your MaxCompute endpoint and project name, and your access ID and secret access key. The access ID and secret access key can either be passed in directly via the access_id and secret_access_key parameters, or they can be set as the environment variables MAX_COMPUTE_ACCESS_ID and MAX_COMPUTE_SECRET_ACCESS_KEY.
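The two ways of supplying credentials can be sketched as follows. This is a minimal illustration, not the loader's own code: resolve_credentials and build_loader are hypothetical helper names, the rule that explicit arguments win over environment variables is an assumption, and the import path and the from_params call reflect the langchain_community package as commonly documented, so check them against your installed version.

```python
import os


def resolve_credentials(access_id=None, secret_access_key=None, env=None):
    """Hypothetical helper: use explicit arguments when given,
    otherwise fall back to the environment variables."""
    env = os.environ if env is None else env
    access_id = access_id or env.get("MAX_COMPUTE_ACCESS_ID")
    secret_access_key = secret_access_key or env.get("MAX_COMPUTE_SECRET_ACCESS_KEY")
    if not access_id or not secret_access_key:
        raise ValueError(
            "Pass access_id and secret_access_key, or set "
            "MAX_COMPUTE_ACCESS_ID and MAX_COMPUTE_SECRET_ACCESS_KEY."
        )
    return access_id, secret_access_key


def build_loader(query, endpoint, project, access_id=None, secret_access_key=None):
    """Hypothetical wrapper around the loader's constructor.

    The import is deferred so this module can be read and tested without
    langchain_community (and its pyodps dependency) installed.
    """
    # Assumed import path and classmethod; verify against your version.
    from langchain_community.document_loaders import MaxComputeLoader

    access_id, secret_access_key = resolve_credentials(access_id, secret_access_key)
    return MaxComputeLoader.from_params(
        query,
        endpoint,
        project,
        access_id=access_id,
        secret_access_key=secret_access_key,
    )
```

Calling build_loader(query, endpoint, project).load() would then run the query against your project and return one document per result row, provided the credentials are valid and the endpoint is reachable.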
Specifying Which Columns are Content vs Metadata
You can configure which subset of columns should be loaded as the contents of the Document and which as the metadata using the page_content_columns and metadata_columns parameters.
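To make the column split concrete, here is a small self-contained sketch of how one result row might be divided between content and metadata. It is an illustration under stated assumptions, not the loader's implementation: the Document class is a stand-in, row_to_document is a hypothetical helper, and the "column: value" line format and the default behavior when a parameter is omitted are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for the library's Document type (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def row_to_document(row, page_content_columns=None, metadata_columns=None):
    """Hypothetical helper: split one result row into content and metadata."""
    # Assumption: with no page_content_columns, every column is content.
    if page_content_columns:
        content = {k: v for k, v in row.items() if k in page_content_columns}
    else:
        content = dict(row)
    # Assumption: with no metadata_columns, the remaining columns are metadata.
    if metadata_columns:
        metadata = {k: v for k, v in row.items() if k in metadata_columns}
    else:
        metadata = {k: v for k, v in row.items() if k not in content}
    # Assumption: content is rendered as one "column: value" line per column.
    page_content = "\n".join(f"{k}: {v}" for k, v in content.items())
    return Document(page_content=page_content, metadata=metadata)


row = {"id": 1, "content": "content1", "source": "table_a"}
doc = row_to_document(row, page_content_columns=["content"], metadata_columns=["id"])
print(doc.page_content)
print(doc.metadata)
```

In this sketch a column named in neither parameter (here, source) is dropped from the document, which is one reasonable design choice when both lists are given explicitly.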