Using OpenSearch Ingestion pipelines with Atlassian services
You can use the Atlassian Jira and Confluence source plugins to ingest data from Atlassian services into your OpenSearch Ingestion pipelines. These integrations help you build a unified, searchable knowledge base by syncing complete Jira projects and Confluence spaces, and they keep that data current through continuous monitoring and automatic update syncing.
Topics
- Prerequisites
- Configure the pipeline role
- Jira connector pipeline configuration
- Confluence connector pipeline configuration
- Data consistency
- Limitations
- Atlassian connector metrics in CloudWatch
Prerequisites
Complete the following steps before you create your OpenSearch Ingestion pipeline:
- Prepare credentials for your Jira site by choosing one of the following options. OpenSearch Ingestion requires only ReadOnly authorization to the content.

  - Option 1: API key – Sign in to your Atlassian account and generate an API key using the information in the Atlassian documentation.

  - Option 2: OAuth2 – Sign in to your Atlassian account and use the information in Connect Amazon OpenSearch Ingestion pipelines to Atlassian Jira or Confluence using OAuth 2.0.

- Create a secret in Amazon Secrets Manager to store the credentials that you created in the previous step. As you follow the steps to create the secret, make the following choices:

  - For Secret type, choose Other type of secret.

  - For Key/value pairs, create the pairs that correspond to the authorization type you chose (see the example secret values after this list).

  After you create the secret, copy its Amazon Resource Name (ARN). You will include it in the pipeline role permissions policy.
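As a rough illustration only, a secret for basic authentication holds your Atlassian account name and API key, and a secret for OAuth2 holds the four OAuth2 values. The key names shown here (username, password, clientId, clientSecret, accessToken, refreshToken) are placeholders; use whichever names your pipeline configuration references through aws_secrets, as in the examples later in this topic.

Basic authentication:

```json
{
  "username": "jira-user@example.com",
  "password": "example-atlassian-api-key"
}
```

OAuth2:

```json
{
  "clientId": "example-client-id",
  "clientSecret": "example-client-secret",
  "accessToken": "example-access-token",
  "refreshToken": "example-refresh-token"
}
```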
Configure the pipeline role
The role that you pass to the pipeline must have a policy attached that allows it to read and write the secret created in the Prerequisites section.
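The following is a minimal sketch of such a policy, using a placeholder secret ARN; replace it with the ARN that you copied in the prerequisites. Read access is always required, and write access is needed if you use OAuth2, because the pipeline writes renewed tokens back into the secret.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadAndWritePipelineSecret",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "secretsmanager:PutSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:jira-account-credentials-AbCdEf"
    }
  ]
}
```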
The role should also have a policy attached that grants access to write to the sink of your choice. For example, if you choose OpenSearch as the sink, the policy looks similar to the following.
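The following is a minimal sketch of a domain access policy, using a placeholder account ID, Region, and domain name; adjust it to your own domain. The same role must also have a trust relationship with osis-pipelines.amazonaws.com, as the configuration comments below note.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeDomain",
      "Effect": "Allow",
      "Action": "es:DescribeDomain",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/*"
    },
    {
      "Sid": "WriteToDomain",
      "Effect": "Allow",
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/mydomain/*"
    }
  ]
}
```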
Jira connector pipeline configuration
You can use a preconfigured Atlassian Jira blueprint to create this pipeline. For more information, see Using blueprints.
Replace the placeholder values with your own information.
version: "2" extension: aws: secrets: jira-account-credentials: secret_id: "secret-arn" region: "secret-region" sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role" atlassian-jira-pipeline: source: jira: # We only support one host url for now hosts: ["jira-host-url"] acknowledgments: true authentication: # Provide one of the authentication method to use. Supported methods are 'basic' and 'oauth2'. # For basic authentication, password is the API key that you generate using your jira account basic: username: ${{aws_secrets:jira-account-credentials:username}} password: ${{aws_secrets:jira-account-credentials:password}} # For OAuth2 based authentication, we require the following 4 key values stored in the secret # Follow atlassian instructions at the below link to generate these keys. # https://developer.atlassian.com/cloud/confluence/oauth-2-3lo-apps/ # If you are using OAuth2 authentication, we also require, write permission to your Amazon secret to # be able to write the renewed tokens back into the secret. # oauth2: # client_id: ${{aws_secrets:jira-account-credentials:clientId}} # client_secret: ${{aws_secrets:jira-account-credentials:clientSecret}} # access_token: ${{aws_secrets:jira-account-credentials:accessToken}} # refresh_token: ${{aws_secrets:jira-account-credentials:refreshToken}} filter: project: key: include: # This is not project name. # It is an alphanumeric project key that you can find under project details in Jira. - "project-key" - "project-key" # exclude: # - "project-key" # - "project-key" issue_type: include: - "issue-type" # - "Story" # - "Bug" # - "Task" # exclude: # - "Epic" status: include: - "ticket-status" # - "To Do" # - "In Progress" # - "Done" # exclude: # - "Backlog" sink: - opensearch: # Provide an Amazon OpenSearch Service domain endpoint hosts: [ "https://search-mydomain-1a2a3a4a5a6a7a8a9a0a9a8a7a.us-east-1.es.amazonaws.com" ] index: "index_${getMetadata(\"project\")}" # Ensure adding unique document id which is the unique ticket id in this case document_id: '${/id}' aws: # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role" # Provide the region of the domain. region: "us-east-1" # Enable the 'serverless' flag if the sink is an Amazon OpenSearch Serverless collection serverless: false # serverless_options: # Specify a name here to create or update network policy for the serverless collection # network_policy_name: "network-policy-name" # Enable the 'distribution_version' setting if the Amazon OpenSearch Service domain is of version Elasticsearch 6.x # distribution_version: "es6" # Enable and switch the 'enable_request_compression' flag if the default compression setting is changed in the domain. # See 在 Amazon OpenSearch Service 中压缩 HTTP 请求 # enable_request_compression: true/false # Optional: Enable the S3 DLQ to capture any failed requests in an S3 bucket. Delete this entire block if you don't want a DLQ. dlq: s3: # Provide an S3 bucket bucket: "your-dlq-bucket-name" # Provide a key path prefix for the failed requests # key_path_prefix: "kinesis-pipeline/logs/dlq" # Provide the region of the bucket. region: "us-east-1" # Provide a Role ARN with access to the bucket. This role should have a trust relationship with osis-pipelines.amazonaws.com sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role"
Key attributes in the Jira source:

- hosts: Your Jira cloud or on-premises URL. It typically looks like https://your-domain-name.atlassian.net/.

- acknowledgments: Ensures end-to-end delivery of data to the sink.

- authentication: Describes how you want the pipeline to access your Jira instance. Choose Basic or OAuth2 and specify the corresponding key attributes, which reference the keys in your Amazon secret.

- filter: This section helps you select which Jira data to extract and sync.

  - project: List the project keys to sync in the include section. Otherwise, list the projects to exclude under the exclude section. Provide only one of include or exclude at any given time.

  - issue_type: The specific issue types that you want to sync. Follow a similar include or exclude pattern as needed. Note that attachments appear as anchor links to the original attachment, but the attachment content is not extracted.

  - status: The specific status filter that you want to apply to the data extraction query. If you specify include, only tickets with those statuses are synced. If you specify exclude, all tickets except those with the listed excluded statuses are synced.
Confluence connector pipeline configuration
You can use a preconfigured Atlassian Confluence blueprint to create this pipeline. For more information, see Using blueprints.
version: "2" extension: aws: secrets: confluence-account-credentials: secret_id: "secret-arn" region: "secret-region" sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role" atlassian-confluence-pipeline: source: confluence: # We currently support only one host URL. hosts: ["confluence-host-url"] acknowledgments: true authentication: # Provide one of the authentication method to use. Supported methods are 'basic' and 'oauth2'. # For basic authentication, password is the API key that you generate using your Confluence account basic: username: ${{aws_secrets:confluence-account-credentials:confluenceId}} password: ${{aws_secrets:confluence-account-credentials:confluenceCredential}} # For OAuth2 based authentication, we require the following 4 key values stored in the secret # Follow atlassian instructions at the following link to generate these keys: # https://developer.atlassian.com/cloud/confluence/oauth-2-3lo-apps/ # If you are using OAuth2 authentication, we also require write permission to your Amazon secret to # be able to write the renewed tokens back into the secret. # oauth2: # client_id: ${{aws_secrets:confluence-account-credentials:clientId}} # client_secret: ${{aws_secrets:confluence-account-credentials:clientSecret}} # access_token: ${{aws_secrets:confluence-account-credentials:accessToken}} # refresh_token: ${{aws_secrets:confluence-account-credentials:refreshToken}} filter: space: key: include: # This is not space name. # It is a space key that you can find under space details in Confluence. - "space key" - "space key" # exclude: # - "space key" # - "space key" page_type: include: - "content type" # - "page" # - "blogpost" # - "comment" # exclude: # - "attachment" sink: - opensearch: # Provide an Amazon OpenSearch Service domain endpoint hosts: [ "https://search-mydomain-1a2a3a4a5a6a7a8a9a0a9a8a7a.us-east-1.es.amazonaws.com" ] index: "index_${getMetadata(\"space\")}" # Ensure adding unique document id which is the unique ticket ID in this case. document_id: '${/id}' aws: # Provide the Amazon Resource Name (ARN) for a role with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com. sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role" # Provide the Region of the domain. region: "us-east-1" # Enable the 'serverless' flag if the sink is an Amazon OpenSearch Serverless collection serverless: false # serverless_options: # Specify a name here to create or update network policy for the serverless collection. # network_policy_name: "network-policy-name" # Enable the 'distribution_version' setting if the Amazon OpenSearch Service domain is of version Elasticsearch 6.x # distribution_version: "es6" # Enable and switch the 'enable_request_compression' flag if the default compression setting is changed in the domain. # For more information, see 在 Amazon OpenSearch Service 中压缩 HTTP 请求. # enable_request_compression: true/false # Optional: Enable the S3 DLQ to capture any failed requests in an S3 bucket. Delete this entire block if you don't want a DLQ. dlq: s3: # Provide an S3 bucket bucket: "your-dlq-bucket-name" # Provide a key path prefix for the failed requests # key_path_prefix: "kinesis-pipeline/logs/dlq" # Provide the Rregion of the bucket. region: "us-east-1" # Provide the Amazon Resource Name (ARN) for a role with access to the bucket. This role should have a trust relationship with osis-pipelines.amazonaws.com sts_role_arn: "arn:aws:iam::123456789012:role/Example-Role"
Key attributes in the Confluence source:

- hosts: Your Confluence cloud or on-premises URL. It typically looks like https://your-domain-name.atlassian.net/.

- acknowledgments: Ensures end-to-end delivery of data to the sink.

- authentication: Describes how you want the pipeline to access your Confluence instance. Choose Basic or OAuth2 and specify the corresponding key attributes, which reference the keys in your Amazon secret.

- filter: This section helps you select which Confluence data to extract and sync.

  - space: List the space keys to sync in the include section. Otherwise, list the spaces to exclude under the exclude section. Provide only one of include or exclude at any given time.

  - page_type: The specific page types (such as page, blogpost, or attachment) that you want to sync. Follow a similar include or exclude pattern as needed. Note that attachments appear as anchor links to the original attachment, but the attachment content is not extracted.
Data consistency
Based on the filters specified in the pipeline YAML, the selected projects (or spaces) are extracted once and fully synced to the target sink. Continuous change monitoring then captures changes as they occur and updates the data in the sink. One exception is that change monitoring syncs only create and update actions, not delete actions.
Limitations
- User delete actions are not synced. After data is recorded in the sink, it remains in the sink. Updates overwrite the existing content with the new changes if ID mapping is specified in the sink settings.
- On-premises instances that run older versions of Atlassian software are not compatible with this source if they don't support the following APIs:
  - Jira Search API version 3
    - rest/api/3/search
    - rest/api/3/issue
  - Confluence
    - wiki/rest/api/content/search
    - wiki/rest/api/content
    - wiki/rest/api/settings/systemInfo
Atlassian connector metrics in CloudWatch
Jira connector metrics

| Metric | Metric type | Description |
|---|---|---|
| acknowledgementSetSuccesses.count | Counter | If acknowledgments are enabled, this metric provides the number of tickets synced successfully. |
| acknowledgementSetFailures.count | Counter | If acknowledgments are enabled, this metric provides the number of tickets that failed to sync. |
| crawlingTime.avg | Timer | The time taken to crawl through all of the new changes. |
| ticketFetchLatency.avg | Timer | The ticket fetch API latency average. |
| ticketFetchLatency.max | Timer | The ticket fetch API latency maximum. |
| ticketsRequested.count | Counter | The number of ticket fetch requests made. |
| ticketRequestedFailed.count | Counter | The number of ticket fetch requests that failed. |
| ticketRequestedSuccess.count | Counter | The number of ticket fetch requests that succeeded. |
| searchCallLatency.avg | Timer | The search API call latency average. |
| searchCallLatency.max | Timer | The search API call latency maximum. |
| searchResultsFound.count | Counter | The number of items found in a given search call. |
| searchRequestFailed.count | Counter | The number of failed search API calls. |
| authFailures.count | Counter | The number of authentication failures. |
Confluence connector metrics

| Metric | Metric type | Description |
|---|---|---|
| acknowledgementSetSuccesses.count | Counter | If acknowledgments are enabled, this metric provides the number of pages synced successfully. |
| acknowledgementSetFailures.count | Counter | If acknowledgments are enabled, this metric provides the number of pages that failed to sync. |
| crawlingTime.avg | Timer | The time taken to crawl through all of the new changes. |
| pageFetchLatency.avg | Timer | The content fetch API latency (average). |
| pageFetchLatency.max | Timer | The content fetch API latency (maximum). |
| pagesRequested.count | Counter | The number of content fetch API calls. |
| pageRequestFailed.count | Counter | The number of failed content fetch API requests. |
| pageRequestedSuccess.count | Counter | The number of successful content fetch API requests. |
| searchCallLatency.avg | Timer | The search API call latency average. |
| searchCallLatency.max | Timer | The search API call latency maximum. |
| searchResultsFound.count | Counter | The number of items found in a given search call. |
| searchRequestsFailed.count | Counter | The number of failed search API calls. |
| authFailures.count | Counter | The number of authentication failures. |