DropFields class
Drops fields within a DynamicFrame.
Methods
__call__(frame, paths, transformation_ctx = "", info = "", stageThreshold = 0, totalThreshold = 0)
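Drops nodes within a DynamicFrame.
frame – The DynamicFrame in which to drop the nodes (required).
paths – A list of full paths to the nodes to drop (required).
transformation_ctx – A unique string that is used to identify state information (optional).
info – A string associated with errors in the transformation (optional).
stageThreshold – The maximum number of errors that can occur in the transformation before it errors out (optional; the default is zero).
totalThreshold – The maximum number of errors that can occur overall before processing errors out (optional; the default is zero).
Returns a new DynamicFrame without the specified fields.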
apply(cls, *args, **kwargs)
Inherited from GlueTransform apply.
name(cls)
Inherited from GlueTransform name.
describeArgs(cls)
Inherited from GlueTransform describeArgs.
describeReturn(cls)
Inherited from GlueTransform describeReturn.
describeTransform(cls)
Inherited from GlueTransform describeTransform.
describeErrors(cls)
Inherited from GlueTransform describeErrors.
describe(cls)
Inherited from GlueTransform describe.
Examples
Dataset used for DropFields examples
The following dataset is used for the DropFields examples:
{name: Sally, age: 23, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}
This dataset has the following schema:
root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
Example: Drop a top-level field
Use code similar to the following to drop the age field:
df_no_age = DropFields.apply(df, paths=['age'])
Resulting dataset:
{name: Sally, location: {state: WY, county: Fremont}, friends: []}
{name: Varun, location: {state: NE, county: Douglas}, friends: [{name: Arjun, age: 3}]}
{name: George, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, location: {state: AK, county: Denali}}
{name: Sheila, friends: [{name: Nancy, age: 22}]}
Resulting schema:
root
|-- name: string
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
Example: Drop a nested field
To drop a nested field, qualify the field name with '.'.
df_no_county = DropFields.apply(df, paths=['location.county'])
Resulting dataset:
{name: Sally, age: 23, location: {state: WY}, friends: []}
{name: Varun, age: 34, location: {state: NE}, friends: [{name: Arjun, age: 3}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy, age: 15}]}
{name: Haruki, age: 21, location: {state: AK}}
{name: Sheila, age: 63, friends: [{name: Nancy, age: 22}]}
If you drop the last element of a struct type, the transform removes the entire struct. Here the county field has already been dropped, so state is the only field left in location:
df_no_location = DropFields.apply(df_no_county, paths=['location.state'])
Resulting schema:
root
|-- name: string
|-- age: int
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
|    |    |-- age: int
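The struct-removal rule can be easier to see on plain Python dictionaries. The following is a minimal sketch, not Glue code: `drop_path` and `_drop` are hypothetical helpers written for illustration, and they only model the behavior described in this example (a dotted path selects a nested field, and a struct that loses its last field is removed).

```python
import copy


def drop_path(record, path):
    """Return a copy of a nested dict without the field at the dotted path.

    Hypothetical helper that models the documented behavior on plain dicts.
    It is not part of the awsglue library.
    """
    result = copy.deepcopy(record)
    _drop(result, path.split("."))
    return result


def _drop(node, parts):
    head, rest = parts[0], parts[1:]
    if head not in node:
        return  # nothing to drop in this record
    if not rest:
        del node[head]
        return
    child = node[head]
    if isinstance(child, dict):
        _drop(child, rest)
        if not child:
            # The struct lost its last field, so remove the struct as well.
            del node[head]


sally = {
    "name": "Sally",
    "age": 23,
    "location": {"state": "WY", "county": "Fremont"},
    "friends": [],
}

no_county = drop_path(sally, "location.county")
no_location = drop_path(no_county, "location.state")

print(no_county)
print(no_location)
```

In this sketch, the first call leaves `location` with only `state`, and the second call removes `location` entirely because no fields remain in it.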
Example: Drop a nested field from an array
No special syntax is needed to drop a field from a struct that is nested inside an array. For example, the following code drops the age field from the friends array:
df_no_friend_age = DropFields.apply(df, paths=['friends.age'])
Resulting dataset:
{name: Sally, age: 23, location: {state: WY, county: Fremont}}
{name: Varun, age: 34, location: {state: NE, county: Douglas}, friends: [{name: Arjun}]}
{name: George, age: 52, location: {state: NY}, friends: [{name: Fred}, {name: Amy}]}
{name: Haruki, age: 21, location: {state: AK, county: Denali}}
{name: Sheila, age: 63, friends: [{name: Nancy}]}
Resulting schema:
root
|-- name: string
|-- age: int
|-- location: struct
|    |-- state: string
|    |-- county: string
|-- friends: array
|    |-- element: struct
|    |    |-- name: string
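As a rough illustration of why no array-specific syntax is needed, the sketch below applies the remaining path to every element whenever it meets a list. It operates on plain Python data, `drop_path` is a hypothetical helper rather than a Glue function, and it makes no attempt to model how Glue treats empty arrays.

```python
import copy


def drop_path(value, parts):
    """Remove the field named by parts from nested dicts, in place.

    Hypothetical helper: when it meets a list, it applies the same
    remaining path to every element of that list.
    """
    if isinstance(value, list):
        for element in value:
            drop_path(element, parts)
    elif isinstance(value, dict):
        head, rest = parts[0], parts[1:]
        if head in value:
            if rest:
                drop_path(value[head], rest)
            else:
                del value[head]


george = {
    "name": "George",
    "age": 52,
    "location": {"state": "NY"},
    "friends": [{"name": "Fred"}, {"name": "Amy", "age": 15}],
}

george_no_friend_age = copy.deepcopy(george)
drop_path(george_no_friend_age, "friends.age".split("."))

print(george_no_friend_age)
```

The top-level `age` field is untouched because the path `friends.age` only matches fields inside the elements of `friends`.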
DropFields example
In the following example, backticks (`) are required around .zip because the column name contains a period (.).
dyf_dropfields = DropFields.apply(frame = dyf_join, paths = "`.zip`")
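To see why the quoting matters, consider a toy path splitter that treats a period as a separator unless it appears between backticks. This is an assumption made for illustration only: `split_path` is a hypothetical function, and the actual path parsing inside Glue may differ.

```python
def split_path(path):
    """Split a dotted path into segments.

    Hypothetical parser: text between backticks is kept as one literal
    segment, so periods inside it do not act as separators.
    """
    segments, current, quoted = [], [], False
    for ch in path:
        if ch == "`":
            quoted = not quoted  # enter or leave a quoted segment
        elif ch == "." and not quoted:
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    segments.append("".join(current))
    return segments


print(split_path("location.county"))  # ['location', 'county']
print(split_path("`.zip`"))           # ['.zip']
print(split_path(".zip"))             # ['', 'zip']
```

Under this model, the unquoted `.zip` would be read as an empty field name followed by a nested field `zip`, while the backtick-quoted form names the single column `.zip`.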