对 SUPER 数据类型路径使用动态数据掩蔽 - Amazon Redshift
Amazon Web Services 文档中描述的 Amazon Web Services 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅 中国的 Amazon Web Services 服务入门 (PDF)

对 SUPER 数据类型路径使用动态数据掩蔽

Amazon Redshift 支持将动态数据掩蔽策略附加到 SUPER 类型列的路径。有关 SUPER 数据类型的更多信息,请参阅在 Amazon Redshift 中摄取和查询半结构化数据

将掩蔽策略附加到 SUPER 类型列的路径时,请考虑以下几点。

  • 将掩蔽策略附加到列上的路径时,必须将该列定义为 SUPER 数据类型。只能对 SUPER 路径上的标量值应用掩蔽策略。不能将掩蔽策略应用于复杂的结构或数组。

  • 只要 SUPER 路径不冲突,就可以对单个 SUPER 列上的多个标量值应用不同的掩蔽策略。例如,SUPER 路径 a.ba.b.c 冲突,因为它们在同一路径上,a.ba.b.c 的父路径。SUPER 路径 a.b.ca.b.d 不冲突。

  • 在用户查询运行时应用策略之前,Amazon Redshift 无法检查数据中是否存在掩蔽策略所附加的路径,也无法检查这些路径是否属于预期类型。例如,当您将掩蔽 TEXT 值的掩蔽策略附加到包含 INT 值的 SUPER 路径时,Amazon Redshift 会尝试在路径上转换值的类型。

    在这种情况下,Amazon Redshift 在运行时的行为取决于您用于查询 SUPER 对象的配置设置。默认情况下,Amazon Redshift 采用宽松模式,对于给定的 SUPER 路径,会将缺失的路径和无效的转换解析为 NULL。有关 SUPER 相关配置设置的更多信息,请参阅SUPER 配置

  • SUPER 是一种无架构类型,这意味着 Amazon Redshift 无法确认给定 SUPER 路径上的值是否存在。如果您将掩蔽策略附加到不存在的 SUPER 路径并且 Amazon Redshift 采用宽松模式,则 Amazon Redshift 会将该路径解析为 NULL 值。我们建议您在将掩蔽策略附加到 SUPER 列的路径时,考虑 SUPER 对象的预期格式以及它们具有意外属性的可能性。如果您认为 SUPER 列中可能存在意外架构,请考虑将掩蔽策略直接附加到 SUPER 列。您可以使用 SUPER 类型信息函数来检查属性和类型,并使用 OBJECT_TRANSFORM 来掩蔽这些值。有关 SUPER 类型信息函数的更多信息,请参阅SUPER 类型信息函数

示例

将掩蔽策略附加到 SUPER 路径

以下示例在一列中将多个掩蔽策略附加到多个 SUPER 类型路径上。

CREATE TABLE employees ( col_person SUPER ); INSERT INTO employees VALUES ( json_parse(' { "name": { "first": "John", "last": "Doe" }, "age": 25, "ssn": "111-22-3333", "company": "Company Inc." } ') ), ( json_parse(' { "name": { "first": "Jane", "last": "Appleseed" }, "age": 34, "ssn": "444-55-7777", "company": "Organization Org." } ') ) ; GRANT ALL ON ALL TABLES IN SCHEMA "public" TO PUBLIC; -- Create the masking policies. -- This policy converts the given name to all uppercase letters. CREATE MASKING POLICY mask_first_name WITH(first_name TEXT) USING ( UPPER(first_name) ); -- This policy replaces the given name with the fixed string 'XXXX'. CREATE MASKING POLICY mask_last_name WITH(last_name TEXT) USING ( 'XXXX'::TEXT ); -- This policy rounds down the given age to the nearest 10. CREATE MASKING POLICY mask_age WITH(age INT) USING ( (FLOOR(age::FLOAT / 10) * 10)::INT ); -- This policy converts the first five digits of the given SSN to 'XXX-XX'. CREATE MASKING POLICY mask_ssn WITH(ssn TEXT) USING ( 'XXX-XX-'::TEXT || SUBSTRING(ssn::TEXT FROM 8 FOR 4) ); -- Attach the masking policies to the employees table. ATTACH MASKING POLICY mask_first_name ON employees(col_person.name.first) TO PUBLIC; ATTACH MASKING POLICY mask_last_name ON employees(col_person.name.last) TO PUBLIC; ATTACH MASKING POLICY mask_age ON employees(col_person.age) TO PUBLIC; ATTACH MASKING POLICY mask_ssn ON employees(col_person.ssn) TO PUBLIC; -- Verify that your masking policies are attached. SELECT policy_name, TABLE_NAME, priority, input_columns, output_columns FROM svv_attached_masking_policy; policy_name | table_name | priority | input_columns | output_columns -----------------+------------+----------+-----------------------------------+----------------------------------- mask_age | employees | 0 | ["col_person.\"age\""] | ["col_person.\"age\""] mask_first_name | employees | 0 | ["col_person.\"name\".\"first\""] | ["col_person.\"name\".\"first\""] mask_last_name | employees | 0 | ["col_person.\"name\".\"last\""] | ["col_person.\"name\".\"last\""] mask_ssn | employees | 0 | ["col_person.\"ssn\""] | ["col_person.\"ssn\""] (4 rows) -- Observe the masking policies taking effect. SELECT col_person FROM employees ORDER BY col_person.age; -- This result is formatted for ease of reading. col_person -------------------------------- { "name": { "first": "JOHN", "last": "XXXX" }, "age": 20, "ssn": "XXX-XX-3333", "company": "Company Inc." } { "name": { "first": "JANE", "last": "XXXX" }, "age": 30, "ssn": "XXX-XX-7777", "company": "Organization Org." }

以下是附加到 SUPER 路径的无效掩蔽策略的一些示例。

-- This attachment fails because there is already a policy -- with equal priority attached to employees.name.last, which is -- on the same SUPER path as employees.name. ATTACH MASKING POLICY mask_ssn ON employees(col_person.name) TO PUBLIC; ERROR: DDM policy "mask_last_name" is already attached on relation "employees" column "col_person."name"."last"" with same priority -- Create a masking policy that masks DATETIME objects. CREATE MASKING POLICY mask_date WITH(INPUT DATETIME) USING ( INPUT ); -- This attachment fails because SUPER type columns can't contain DATETIME objects. ATTACH MASKING POLICY mask_date ON employees(col_person.company) TO PUBLIC; ERROR: cannot attach masking policy for output of type "timestamp without time zone" to column "col_person."company"" of type "super

以下是将掩蔽策略附加到不存在的 SUPER 路径的示例。默认情况下,Amazon Redshift 将路径解析为 NULL

ATTACH MASKING POLICY mask_first_name ON employees(col_person.not_exists) TO PUBLIC; SELECT col_person FROM employees LIMIT 1; -- This result is formatted for ease of reading. col_person ----------------------------------- { "name": { "first": "JOHN", "last": "XXXX" }, "age": 20, "ssn": "XXX-XX-3333", "company": "Company Inc.", "not_exists": null }