DropFields class - AWS Glue

DropFields class

Drops fields within a DynamicFrame.

Methods

__call__(frame, paths, transformation_ctx = "", info = "", stageThreshold = 0, totalThreshold = 0)

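Drops nodes within a DynamicFrame.

  • frame – The DynamicFrame in which to drop the nodes (required).

  • paths – A list of full paths to the nodes to drop (required).

  • transformation_ctx – A unique string that is used to identify state information (optional).

  • info – A string associated with errors in the transformation (optional).

  • stageThreshold – The maximum number of errors that can occur in the transformation before it errors out (optional; the default is zero).

  • totalThreshold – The maximum number of errors that can occur overall before processing errors out (optional; the default is zero).

Returns a new DynamicFrame without the specified fields.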

apply(cls, *args, **kwargs)

Inherited from GlueTransform apply.

name(cls)

Inherited from GlueTransform name.

describeArgs(cls)

Inherited from GlueTransform describeArgs.

describeReturn(cls)

Inherited from GlueTransform describeReturn.

describeTransform(cls)

Inherited from GlueTransform describeTransform.

describeErrors(cls)

Inherited from GlueTransform describeErrors.

describe(cls)

Inherited from GlueTransform describe.

Examples

Dataset used for the DropFields examples

The following dataset is used for the DropFields examples:

{name: Sally, age: 23, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}

This dataset has the following schema:

root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
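The paths argument takes full dotted paths such as location.county. As a rough illustration of which paths this dataset offers, the following plain-Python sketch walks records held as ordinary dicts and lists and collects every dotted path. It does not use the AWS Glue libraries; field_paths is a hypothetical helper written for this example, and only two of the records are included to keep it short.

```python
def field_paths(value, prefix=""):
    """Collect the dotted path of every field reachable from value."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= field_paths(child, path)
    elif isinstance(value, list):
        # Array elements share the array's path; no index appears in it.
        for element in value:
            paths |= field_paths(element, prefix)
    return paths


# Two records from the example dataset, written as plain Python values.
records = [
    {"name": "Sally", "age": 23,
     "location": {"state": "WY", "county": "Fremont"}, "friends": []},
    {"name": "George", "age": 52, "location": {"state": "NY"},
     "friends": [{"name": "Fred"}, {"name": "Amy", "age": 15}]},
]

all_paths = set()
for record in records:
    all_paths |= field_paths(record)

for path in sorted(all_paths):
    print(path)
```

The printed paths (for example friends.age and location.state) have the same dotted form as the values passed to paths in the examples that follow.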

Example: Drop a top-level field

Use code similar to the following to drop the age field:

df_no_age = DropFields.apply(df, paths=['age'])

Resulting dataset:

{name: Sally, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, location: {state: AK, county: Denali}}
{name: Sheila, friends: [{name: Nancy, age: 22}]}

Resulting schema:

root
|-- name: string
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
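To make the effect of a top-level path concrete, here is a minimal plain-Python sketch of the same operation on dict records. It is an emulation for illustration only, not the AWS Glue implementation; drop_top_level is a hypothetical helper. It shows that only the top-level age is removed, while age inside friends is untouched, and that the input records are left unchanged, in line with the transform returning a new DynamicFrame.

```python
def drop_top_level(record, field):
    """Return a new dict without the given top-level key."""
    return {key: value for key, value in record.items() if key != field}


# Two records from the example dataset, written as plain Python values.
records = [
    {"name": "Sally", "age": 23,
     "location": {"state": "WY", "county": "Fremont"}, "friends": []},
    {"name": "Varun", "age": 34,
     "location": {"state": "NE", "county": "Douglas"},
     "friends": [{"name": "Arjun", "age": 3}]},
]

no_age = [drop_top_level(record, "age") for record in records]

for record in no_age:
    print(record)
```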

Example: Drop a nested field

To drop a nested field, qualify its path with '.':

df_no_county = DropFields.apply(df, paths=['location.county'])

Resulting dataset:

{name: Sally, age: 23, location: {state: WY}, friends: []}
{name: Varun, age: 34, location: {state: NE}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}

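If you drop the last remaining field of a struct type, the transform removes the entire struct. For example, dropping location.state from the previous result leaves location empty, so location is removed as well:

df_no_location = DropFields.apply(df_no_county, paths=['location.state'])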

Resulting schema:

root
|-- name: string
|-- age: int
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
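The struct-removal behavior can be sketched in plain Python as follows. This is an emulation modeled on the example above, not the AWS Glue implementation: drop_path and _drop are hypothetical helpers, and the rule "delete a struct once it has no fields left" is an assumption taken from the resulting schema shown here.

```python
import copy


def _drop(node, parts):
    """Delete the field named by parts from node, in place."""
    head, rest = parts[0], parts[1:]
    if not isinstance(node, dict) or head not in node:
        return  # the path does not exist in this record; nothing to do
    if not rest:
        del node[head]
        return
    child = node[head]
    _drop(child, rest)
    if isinstance(child, dict) and not child:
        # Assumed rule: a struct with no remaining fields is removed too.
        del node[head]


def drop_path(record, path):
    """Return a copy of record without the field at the dotted path."""
    result = copy.deepcopy(record)
    _drop(result, path.split("."))
    return result


haruki = {"name": "Haruki", "age": 21,
          "location": {"state": "AK", "county": "Denali"}}

no_county = drop_path(haruki, "location.county")
no_location = drop_path(no_county, "location.state")

print(no_county)
print(no_location)
```

Working on a deep copy keeps the input record intact, so each call can be chained on the previous result in the same way the example chains on df_no_county.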

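Example: Drop a nested field from an array

No special syntax is needed to drop a field from a struct nested inside an array. For example, we can drop the age field from the structs in the friends array with the following: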

df_no_friend_age = DropFields.apply(df, paths=['friends.age'])

Resulting dataset:

{name: Sally, age: 23, location: {state: WY, county: Fremont}}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy}]}

Resulting schema:

root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
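One way to picture why no array syntax is needed: a path step that lands on an array is simply applied to every element. The plain-Python sketch below emulates that on dict records. It is illustrative only and not the AWS Glue implementation; drop_in_arrays is a hypothetical helper, and it does not attempt to model how empty arrays are handled.

```python
def drop_in_arrays(value, parts):
    """Return value without the field named by parts, mapping over arrays."""
    if isinstance(value, list):
        # Apply the same remaining path to every array element.
        return [drop_in_arrays(element, parts) for element in value]
    if not isinstance(value, dict):
        return value
    head, rest = parts[0], parts[1:]
    result = {}
    for key, child in value.items():
        if key != head:
            result[key] = child
        elif rest:
            result[key] = drop_in_arrays(child, rest)
        # Otherwise key is the final path step, so the field is dropped.
    return result


varun = {"name": "Varun", "age": 34,
         "location": {"state": "NE", "county": "Douglas"},
         "friends": [{"name": "Arjun", "age": 3}]}
george = {"name": "George", "age": 52, "location": {"state": "NY"},
          "friends": [{"name": "Fred"}, {"name": "Amy", "age": 15}]}

path = "friends.age".split(".")
print(drop_in_arrays(varun, path))
print(drop_in_arrays(george, path))
```

The top-level age of each record is kept because the path friends.age only matches age beneath friends.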