Amazon EMR 6.11.0 - Hive release notes

Amazon EMR 6.11.0 - Hive changes

Type Description
Improvement Added support for multithreaded partition drops to improve drop-partition performance
Improvement Added support for reading encoded Hive query files
Improvement Enabled Tez Shuffle Handler by default for Hive on Tez jobs
Bug Added an option to enable deterministic distribution of keys to reducers, which fixes incorrect results when hive.groupby.skewindata is enabled (reported in HIVE-20220)
Bug Fixed stats computation failure when default partition name is configured
Bug HiveServer2 now respects any custom SSL classification parameters passed when SSL is configured out of the box in a cluster with in-transit encryption enabled
Backport HIVE-23617: Fixed storage-api FindBug issues
Backport HIVE-26408: Vectorization: Fix deallocation of scratch columns, don't reuse a child ConstantVectorExpression as an output
Backport HIVE-23614: Always pass HiveConfig to removeTempOrDuplicateFiles
Backport HIVE-23354: Remove file size sanity checking from compareTempOrDuplicateFiles
Backport HIVE-20344: Fixed PrivilegeSynchronizer for SBA throwing AccessControlException. Also introduced property hive.privilege.synchronizer to disable privilege synchronizer
Backport HIVE-15826: Support configuring 'serialization.encoding' for all SerDes
Backport HIVE-18284: Fix NPE when inserting data with 'distribute by' clause with dynpart sort optimization
Backport HIVE-24930: Operator.setDone() short-circuit from child op is not used in vectorized codepath (if childSize == 1)
Backport HIVE-24523: Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES for timestamp
Backport HIVE-23265: Duplicate rowsets are returned with Limit and Offset set
Backport HIVE-21492: VectorizedParquetRecordReader can't read parquet file generated using thrift/custom tool
Backport HIVE-22540: Vectorization: Decimal64 columns don't work with VectorizedBatchUtil.makeLikeColumnVector()
Backport HIVE-22588: Flush the remaining rows for the rest of the grouping sets when switching the vector groupby mode
Backport HIVE-22551: BytesColumnVector initBuffer should clean vector and length consistently
Backport HIVE-22448: CBO: Expand the multiple count distinct with a group-by key
Backport HIVE-22248: Fix statistics persisting issues
Backport HIVE-22210: Vectorization may reuse computation output columns involved in filtering
Backport HIVE-21531: Vectorization: all NULL hashcodes are not computed using Murmur3
Backport HIVE-20419: Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
Backport HIVE-19388: ClassCastException during VectorMapJoinCommonOperator initialization
Backport HIVE-21584: Java 11 preparation: system class loader is not URLClassLoader
Backport HIVE-25107: Classpath logging should be on DEBUG level (#2271)
Backport HIVE-22097: Incompatible java.util.ArrayList for java 11
Backport HIVE-23938: LLAP: JDK11 - some GC log file rotation related jvm arguments cannot be used anymore
Backport HIVE-26226: Exclude jdk.tools dep from hive-metastore in upgrade-acid
Backport HIVE-17879: Upgrade Datanucleus Maven Plugin
Backport HIVE-27004: DateTimeFormatterBuilder#appendZoneText cannot parse 'UTC+' in Java versions higher than 8
Backport HIVE-16812: VectorizedOrcAcidRowBatchReader doesn't filter delete events
Backport HIVE-17917: VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
Backport HIVE-19985: ACID: Skip decoding the ROW__ID sections for read-only queries
Backport HIVE-20635: VectorizedOrcAcidRowBatchReader doesn't filter delete events for original files
Upgrade Upgraded Javadoc to 3.3.1
Upgrade Upgraded Javassist to 3.24.1-GA
Upgrade Upgraded apache-directory-server to 2.0.0-M14
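
The deterministic-distribution option listed above (hive.groupby.enable.deterministic.distribution) is described as passing a constant seed to the rand function used for random partitioning. The following is a minimal Python sketch of that idea, not Hive's implementation: the function name `assign_reducers`, the seed value, and the reducer count are all hypothetical. It only illustrates why a fixed seed makes the row-to-reducer assignment repeatable, which presumably matters when a task is re-run and must send each row to the same reducer as before.

```python
import random


def assign_reducers(rows, num_reducers, seed=None):
    """Randomly assign each row to a reducer index.

    With seed=None the assignment can differ on every call; with a
    constant seed the same input always yields the same assignment.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_reducers) for _ in rows]


rows = ["a", "a", "a", "b", "c", "a", "a"]  # "a" is a skewed key

# Hypothetical constant seed: two runs produce identical assignments.
first_attempt = assign_reducers(rows, num_reducers=4, seed=0)
second_attempt = assign_reducers(rows, num_reducers=4, seed=0)
print(first_attempt == second_attempt)
```

Note that rows with the same key can still land on different reducers, which is the point of skew handling; the seed only makes the spread reproducible.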

New configurations

Name Classification Description
hive.metastore.fs.drop.partition.threads hive-site Number of core threads in the drop partition thread pool.
hive.metastore.fs.drop.partition.keepalive.time hive-site Time in seconds that an idle drop partition async thread (from the thread pool) will wait for a new task to arrive before terminating.
hive.metastore.fs.drop.partition.threadpool.max.queue.size hive-site Maximum queue size to be used in thread pool for dropping of partitions from file system.
hive.groupby.enable.deterministic.distribution hive-site Enable deterministic distribution of keys to reducers. When enabled, a constant seed value is passed to the rand function used for random partitioning.
hive.privilege.synchronizer hive-site Whether to synchronize privileges from external authorizer periodically in HiveServer2.
hive.cli.query.file.encoding hive-site File encoding for all types of query files (query file, init query file, rc file, etc.) provided in the CLI arguments.
hive.emr.tez.shuffle.enabled hive-site Whether Hive on Tez jobs use tez_shuffle instead of mapreduce_shuffle as the Shuffle Handler. Hive on Tez jobs now use tez_shuffle by default.
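
All of the properties above belong to the hive-site classification, so they can be supplied as an Amazon EMR configuration object. The sketch below builds such an object in Python and prints it as JSON. The helper name `hive_site_classification` and every property value shown are assumptions chosen for illustration only; they are not recommended or default settings.

```python
import json


def hive_site_classification(properties):
    """Wrap Hive properties in an EMR configuration object.

    EMR expects property values as strings, so every value is converted.
    """
    return {
        "Classification": "hive-site",
        "Properties": {name: str(value) for name, value in properties.items()},
    }


# Hypothetical values for illustration; tune them for your own workload.
example_properties = {
    "hive.metastore.fs.drop.partition.threads": 8,
    "hive.metastore.fs.drop.partition.keepalive.time": 60,
    "hive.metastore.fs.drop.partition.threadpool.max.queue.size": 1000,
    "hive.groupby.enable.deterministic.distribution": "true",
    "hive.cli.query.file.encoding": "UTF-8",
}

configurations = [hive_site_classification(example_properties)]
print(json.dumps(configurations, indent=2))
```

The resulting list has the same shape as the JSON that EMR accepts when configuring applications at cluster creation.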

Deprecated configurations

The following configuration properties are deprecated as a result of HIVE-23354 and are no longer supported with Amazon EMR releases 6.11.0 and higher.

Name Default value
hive.mapred.reduce.tasks.speculative.execution false
tez.am.speculation.enabled false
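
Before moving a cluster definition to 6.11.0 or higher, it can help to check whether an existing configuration still sets either deprecated property. The following is a small sketch, assuming the configuration is available as a list of EMR configuration objects (dictionaries with "Classification", "Properties", and optionally nested "Configurations"). The function name `find_deprecated` and the sample input are hypothetical.

```python
# Properties listed as deprecated in the table above.
DEPRECATED_PROPERTIES = {
    "hive.mapred.reduce.tasks.speculative.execution",
    "tez.am.speculation.enabled",
}


def find_deprecated(configurations):
    """Return (classification, property) pairs that use deprecated names."""
    hits = []
    for conf in configurations:
        for name in conf.get("Properties", {}):
            if name in DEPRECATED_PROPERTIES:
                hits.append((conf.get("Classification"), name))
        # Configuration objects may nest further configurations.
        hits.extend(find_deprecated(conf.get("Configurations", [])))
    return hits


# Hypothetical existing configuration used only to exercise the check.
sample = [
    {"Classification": "hive-site",
     "Properties": {"hive.mapred.reduce.tasks.speculative.execution": "false",
                    "hive.cli.query.file.encoding": "UTF-8"}},
    {"Classification": "tez-site",
     "Properties": {"tez.am.speculation.enabled": "false"}},
]

for classification, name in find_deprecated(sample):
    print(f"{classification}: {name} is deprecated in EMR 6.11.0 and higher")
```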