# With IRSA

Self-hosted deployments from AWS Marketplace delegate permissions through [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html). The overall approach is almost the same as the original one ([parent page](/integrations/external-object-storage/aws-s3.md)), but this page documents the IRSA flow separately to avoid confusion between the two and to keep the steps clear for self-hosted users on AWS Marketplace. One of the main differences is in creating the IAM role, specifically the 4th step.

## File Key

This attribute tells Datasaur which file to use when you create a project. It is the path that follows the bucket name in the S3 URI. See the example below.

* Bucket name: `datasaur-test`
* S3 URI: `s3://datasaur-test/some-folder/image.png`
* File key: `/some-folder/image.png`
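
As a minimal sketch, the file key can be derived from an S3 URI with Python's standard library. The helper name `split_s3_uri` is hypothetical and not part of Datasaur. It keeps the leading slash to match the example above, and it does not handle keys that contain `?` or `#`.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple[str, str]:
    """Return (bucket_name, file_key) for an s3:// URI.

    The file key keeps its leading slash, matching the example above.
    """
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path


bucket, file_key = split_s3_uri("s3://datasaur-test/some-folder/image.png")
print(bucket)    # datasaur-test
print(file_key)  # /some-folder/image.png
```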

## Setup

By integrating your bucket into Datasaur, you can create projects using files directly from your S3 bucket.

#### 1. Set up External Object Storage Integration in Datasaur Team Settings

Let's begin by setting up an Integration in Team Settings. By default, Datasaur uses its own storage to manage your projects. By adding another one, we can use your preferred storage provider when creating projects.

1. Open your team page, then go to **Settings** > **Integrations**.
2. Click **Add external object storage**. A new window will pop up. **Do not close the pop-up**, because we will use the External ID shown in it, and a new External ID is generated each time you close and reopen the form.
3. Start by filling in the **Bucket name** attribute. It is used to reference this external object storage and to tell it apart from others.

<figure><img src="/files/cu5czFtzBKBucd36QTES" alt=""><figcaption></figcaption></figure>

We'll get back to this window later. Let's leave it for now.

#### 2. Set up CORS for your S3 bucket

This step allows Datasaur to access resources in your bucket.

1. Log in to your AWS account, then go to the **S3** management console.
2. Click on your preferred bucket. We also highly recommend enabling a lifecycle policy that removes objects under both the `temp/` and `export/` prefixes after 7 days.
3. Open **Permissions**. Edit the **Cross-origin resource sharing (CORS)** section, and paste the following configuration.

```json
[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": [
      "GET",
      "PUT",
      "POST",
      "HEAD",
      "DELETE"
    ],
    "AllowedOrigins": ["<FILL_THIS_WITH_YOUR_DOMAIN>"],
    "ExposeHeaders": []
  }
]
```

* Bucket name: Enter the name of the bucket that you just set the CORS for.
* Bucket prefix: A prefix added in front of paths inside the bucket so that you can group files according to your needs, e.g. `test` will refer to `/{bucket-name}/test`.
* Allowed origins: Replace `<FILL_THIS_WITH_YOUR_DOMAIN>` with your self-hosted domain for the Datasaur app.
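
The recommended lifecycle policy for the `temp/` and `export/` prefixes could be sketched as follows. This is a minimal example, not a configuration published by Datasaur: the function name and rule IDs are hypothetical, and only the two prefixes and the 7-day expiry come from the step above.

```python
import json


def build_lifecycle_configuration(prefixes=("temp/", "export/"), days=7) -> dict:
    """Build an S3 lifecycle configuration that expires objects under each prefix."""
    return {
        "Rules": [
            {
                # Hypothetical rule name; any unique ID works.
                "ID": f"expire-{prefix.rstrip('/')}-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
            for prefix in prefixes
        ]
    }


config = build_lifecycle_configuration()
print(json.dumps(config, indent=2))
```

The resulting dictionary has the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. Note that applying it replaces any lifecycle configuration already set on the bucket, so merge in existing rules first if you have them.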

#### 3. Create a policy for Datasaur role in AWS

You need to create a policy to access your S3 bucket. If you have already set up a policy for accessing the bucket, feel free to skip this step.

1. In your AWS **IAM management console**, go to **Policies**, then click on **Create Policy**.
2. Choose the JSON tab, and paste the following configuration. Don't forget to **replace the resource with your bucket name**. The write permissions are used to upload the selected files to your bucket, whereas `s3:GetBucketLocation` is used to configure requests based on your bucket's region.

   ```json
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Action": [
           "s3:ListBucket",
           "s3:ListBucketVersions",
           "s3:PutObjectAcl",
           "s3:PutObject",
           "s3:GetObjectAcl",
           "s3:GetObject",
           "s3:DeleteObjectVersion",
           "s3:DeleteObject",
           "s3:GetBucketLocation"
         ],
         "Effect": "Allow",
         "Resource": [
           "arn:aws:s3:::<your-bucket-name>/*",
           "arn:aws:s3:::<your-bucket-name>"
         ]
       }
     ]
   }
   ```
3. Click on **Next: Tags**. We don't require tags to be added, but you can add tags here if you want.
4. Click on **Next: Review**. Input a name for the AWS Policy, a description (optional), and click on **Create Policy**.
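
To avoid hand-editing the resource ARNs, the policy document from this step can be generated with the bucket name filled in. This is a minimal sketch: `build_bucket_policy` is a hypothetical helper, and `datasaur-test` is the example bucket name from the File Key section.

```python
import json

# The same actions as in the policy above.
ACTIONS = [
    "s3:ListBucket",
    "s3:ListBucketVersions",
    "s3:PutObjectAcl",
    "s3:PutObject",
    "s3:GetObjectAcl",
    "s3:GetObject",
    "s3:DeleteObjectVersion",
    "s3:DeleteObject",
    "s3:GetBucketLocation",
]


def build_bucket_policy(bucket_name: str) -> dict:
    """Return the IAM policy document scoped to one bucket and its objects."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ACTIONS,
                "Effect": "Allow",
                # Object-level ARN first, then the bucket itself.
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            }
        ],
    }


print(json.dumps(build_bucket_policy("datasaur-test"), indent=2))
```

The printed JSON can be pasted into the JSON tab in place of the template.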

#### 4. Create a role for Datasaur

After we've created a policy for your S3 bucket, we need to attach it to a role which will be assumed by Datasaur to access your bucket.

<figure><img src="/files/9YqIVXLNBXmq4brnlKdn" alt=""><figcaption><p>Figure 2: Creating a role</p></figcaption></figure>

1. Back on the **IAM management console**, go to **Roles**, then click on **Create role**.
2. Choose **AWS account** in the trusted entity type section.
3. Select **Custom trust policy** for the trusted entity type attribute, then paste the configuration below.

   ```json
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
           "AWS": "arn:aws:iam::<DATASAUR_AWS_ACCOUNT_ID>:role/<IRSA_ROLE_NAME>"
         },
         "Action": "sts:AssumeRole",
         "Condition": {
           "StringEquals": {
             "sts:ExternalId": "<YOUR_EXTERNAL_ID>"
           }
         }
       }
     ]
   }
   ```
4. Replace the values for the AWS account ID, IRSA role name, and external ID accordingly. Use the AWS account ID that is displayed. You can define your own external ID; if you do, be sure to update the value in the external object storage form so that the two match.
5. In the **Add permissions** section, pick the policy that we've just created from the previous step. Then, click on **Next**.
6. Input a name and, optionally, a description, then click on **Create role**.
7. After that, back on the Roles page, click on your newly created role.
8. Copy the **Role ARN** from the page and paste it into the Datasaur Team Settings page.
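
The three placeholder substitutions in the trust policy can also be done programmatically. The sketch below assumes a hypothetical helper `build_trust_policy`; the account ID, role name, and external ID in the usage line are made-up illustration values, not real Datasaur values.

```python
import json


def build_trust_policy(account_id: str, irsa_role_name: str, external_id: str) -> dict:
    """Return the trust policy above with its three placeholders filled in."""
    # AWS account IDs are 12-digit numbers.
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account ID must be 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{account_id}:role/{irsa_role_name}"
                },
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Illustration values only; use the ones from your own setup.
policy = build_trust_policy("123456789012", "example-irsa-role", "example-external-id")
print(json.dumps(policy, indent=2))
```

The external ID passed here must be the same one entered in the external object storage form, otherwise the `StringEquals` condition rejects the request.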

#### 5. Check connection

Before you create the integration, run a connection check to make sure your setup is correct. If it succeeds, you can continue to create the external object storage.
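
If the check fails, it can help to reproduce the same kind of request yourself. The sketch below is an assumption about what such a check involves (assume the role with the external ID, then query the bucket's region); it is not Datasaur's actual implementation. `check_connection` is a hypothetical helper that takes the clients as arguments: `sts_client` is expected to behave like `boto3.client("sts")`, and `make_s3_client` to return an S3 client built from the temporary credentials.

```python
def check_connection(sts_client, make_s3_client, role_arn, external_id, bucket_name):
    """Assume the role, then ask S3 for the bucket's region.

    Raises whatever the clients raise if either call is denied.
    """
    assumed = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="connection-check",  # hypothetical session name
        ExternalId=external_id,
    )
    s3_client = make_s3_client(assumed["Credentials"])
    location = s3_client.get_bucket_location(Bucket=bucket_name)
    # S3 reports buckets in us-east-1 with an empty LocationConstraint.
    return location.get("LocationConstraint") or "us-east-1"


def boto3_s3_factory(credentials):
    """Build a real S3 client from assumed-role credentials (requires boto3)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    return boto3.client(
        "s3",
        aws_access_key_id=credentials["AccessKeyId"],
        aws_secret_access_key=credentials["SecretAccessKey"],
        aws_session_token=credentials["SessionToken"],
    )
```

With boto3 installed, a call would look like `check_connection(boto3.client("sts"), boto3_s3_factory, role_arn, external_id, bucket_name)`. It only succeeds when run with the credentials of the principal named in the trust policy, since no other identity is allowed to assume the role.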

#### 6. Good to go!

Now you can create projects using files directly from your S3 bucket, and also change the default object storage to whichever one you want from the **Settings** page.

If you have any questions or comments, please let us know, and we'll be happy to support you.

