Configuring a data source in OpenSearch Dashboards - Amazon OpenSearch Service

Configuring a data source in OpenSearch Dashboards

Now that you've created your data source, you can configure security settings, define your Amazon S3 tables, or set up accelerated data indexing. This section walks you through various use cases with your data source in OpenSearch Dashboards before you query your data.

Before you work through the following sections, navigate to your data source in OpenSearch Dashboards. In the left navigation, under Management, choose Data sources. Under Manage data sources, select the name of the data source that you created in the console.

Set up access control

On the details page for your data source, find the Access controls section and choose Edit. If you have the security plugin installed, choose Restricted and select the role-based groups that should have access to the new data source. To limit access to the administrator, choose Admin only.

Important

Indexes are used for any queries against the data source. A user with read access to the request index for a given data source can read all queries against that data source. A user with read access to the result index can read results for all queries against that data source.

Set up integrations for popular Amazon log types

OpenSearch Dashboards helps you get started quickly with common log types stored in Amazon S3 in the Apache Parquet format. It offers integrations that provide access to Amazon Glue Data Catalog tables, saved queries, and dashboards. You can set up integrations from the data source details page or the left navigation. To do this:

  1. Select the log type you want to install. Make sure the log type you install has the Amazon S3 tag.

  2. For the connection type, select Amazon S3 connection if it isn't already selected.

  3. Select the name of the data source to install the integration on, the Amazon S3 location of the data, the checkpoint location used to maintain acceleration indexing status, and the assets you want based on your use case.

    Note

    When you created the IAM role, you specified an Amazon S3 resource with write permissions for use as the checkpoint location. Reference a checkpoint location in an Amazon S3 bucket that the role can write to. Otherwise, the accelerations that the integration installs will fail.

    Note

    Amazon VPC flow log integration requires a patch to be installed using OpenSearch Dashboards. It may take a few minutes to populate the dashboards you've installed.

Create Spark tables using Query Workbench

Direct queries from OpenSearch Service to Amazon S3 use Spark tables within the Amazon Glue Data Catalog. You can create tables from within the Query Workbench without having to leave OpenSearch.

To manage existing databases and tables in your data source, or to create new tables to run direct queries on, choose Query Workbench from the left navigation and select the Amazon S3 data source from the data source dropdown.

To set up a table for VPC flow logs stored in Amazon S3 in Parquet format, run the following query:

-- datasourcename, gluedatabasename, vpclogstable, and the S3 path are placeholders.
CREATE TABLE datasourcename.gluedatabasename.vpclogstable (
    version INT,
    account_id STRING,
    interface_id STRING,
    srcaddr STRING,
    dstaddr STRING,
    srcport INT,
    dstport INT,
    protocol INT,
    packets BIGINT,
    bytes BIGINT,
    start BIGINT,
    end BIGINT,
    action STRING,
    log_status STRING,
    `aws-account-id` STRING,
    `aws-service` STRING,
    `aws-region` STRING,
    year STRING,
    month STRING,
    day STRING,
    hour STRING
)
USING parquet
-- Column names that contain hyphens must be quoted with backticks.
PARTITIONED BY (`aws-account-id`, `aws-service`, `aws-region`, year, month, day, hour)
LOCATION "s3://accountnum-vpcflow/AWSLogs"

After creating the table, run the following query to register its existing partitions in the catalog so that it's compatible with direct queries:

MSCK REPAIR TABLE datasourcename.gluedatabasename.vpclogstable
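
Once the table and its partitions are registered, a quick test query can confirm that the data is readable. The following is a minimal sketch, not an official example. It assumes the placeholder names from the CREATE TABLE statement above (datasourcename, gluedatabasename, vpclogstable), and the partition values '2024', '01', and '15' are hypothetical, so replace them with values that exist in your bucket. Filtering on the partition columns (year, month, day) should limit how much data in Amazon S3 is scanned.

```sql
-- Top 10 source/destination address pairs by rejected traffic volume for one day.
-- Partition values below are placeholders; use values present in your data.
SELECT srcaddr, dstaddr, SUM(bytes) AS total_bytes
FROM datasourcename.gluedatabasename.vpclogstable
WHERE year = '2024'
  AND month = '01'
  AND day = '15'
  AND action = 'REJECT'
GROUP BY srcaddr, dstaddr
ORDER BY total_bytes DESC
LIMIT 10
```

If the query returns no rows, check that the partition values match the folder layout under the table's LOCATION and that the MSCK REPAIR TABLE step completed.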