Adding JAR files and custom Spark configuration
When you create or edit a session in Amazon Athena for Apache Spark, you can use Spark
properties to specify .jar files, packages, or another custom configuration for the session.
To specify your Spark properties, you can use the Athena console, the Amazon CLI, or the Athena API.
Using the Athena console to specify Spark properties
In the Athena console, you can specify your Spark properties when you create a notebook or edit a current session.
To add properties in the Create notebook or Edit session details dialog box
- Expand Spark properties.
- To add your properties, use the Edit in table or Edit in JSON option.
  - For the Edit in table option, choose Add property to add a property, or choose Remove to remove a property. Use the Key and Value boxes to enter property names and their values.
    - To add a custom .jar file, use the spark.jars property.
    - To specify a package file, use the spark.jars.packages property.
  - To enter and edit your configuration directly, choose the Edit in JSON option. In the JSON text editor, you can perform the following tasks:
    - Choose Copy to copy the JSON text to the clipboard.
    - Choose Clear to remove all text from the JSON editor.
    - Choose the settings (gear) icon to configure line wrapping or choose a color theme for the JSON editor.
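As a rough illustration of what could go into the Edit in JSON editor, the following Python sketch prints a small set of properties as JSON text. It assumes the editor accepts a flat JSON object that maps property names to string values. The bucket name, .jar path, and Maven coordinates are hypothetical placeholders, not values from this guide.

```python
import json

# Hypothetical values for illustration only: replace the bucket, .jar path,
# and Maven coordinates with your own.
spark_properties = {
    "spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar",
    "spark.jars.packages": "com.example:my-library:1.0.0",
}

# Pretty-print the properties as JSON text that could be pasted into the editor.
json_text = json.dumps(spark_properties, indent=2)
print(json_text)
```

Both keys use the property names described in the steps above. Everything else in the example is an assumption.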
Notes
- You can set properties in Athena for Spark, which is the same as setting Spark properties directly on a SparkConf object.
- Start all Spark properties with the spark. prefix. Properties with other prefixes are ignored.
- Not all Spark properties are available for custom configuration on Athena. If you submit a StartSession request that has a restricted configuration, the session fails to start.
  - You cannot use the spark.athena. prefix because it is reserved.
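The prefix rules in these notes can be checked locally before you submit a configuration. The following is a minimal sketch, not an official validator: the function name check_spark_properties and the sample keys are hypothetical, and the check cannot tell you which other properties Athena restricts.

```python
from typing import Dict, List, Tuple

REQUIRED_PREFIX = "spark."         # properties with other prefixes are ignored
RESERVED_PREFIX = "spark.athena."  # reserved prefix that you cannot use


def check_spark_properties(
    props: Dict[str, str],
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Split properties into (usable, ignored, reserved) by prefix only."""
    usable: Dict[str, str] = {}
    ignored: List[str] = []
    reserved: List[str] = []
    for key, value in props.items():
        if key.startswith(RESERVED_PREFIX):
            reserved.append(key)   # must be removed before submitting
        elif key.startswith(REQUIRED_PREFIX):
            usable[key] = value    # passes the prefix check
        else:
            ignored.append(key)    # would be ignored by Athena
    return usable, ignored, reserved


# Hypothetical sample input for illustration.
usable, ignored, reserved = check_spark_properties({
    "spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar",
    "spark.athena.example": "value",
    "my.custom.key": "value",
})
```

The reserved prefix is tested first because it also starts with spark. and would otherwise be classified as usable.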
Using the Amazon CLI or Athena API to provide custom configuration
To use the Amazon CLI or Athena API to provide your session configuration, use the StartSession API action or the
start-session CLI command. In your StartSession request, use the SparkProperties
field of the EngineConfiguration object to pass your configuration information in JSON format.
This starts a session with your specified configuration. For request syntax, see StartSession in the Amazon Athena API Reference.
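A StartSession request with custom Spark properties might look like the following sketch, which uses the boto3 Athena client. The helper names, the workgroup name, the .jar path, and the MaxConcurrentDpus value of 20 are assumptions for illustration. Check the API reference for the fields your workgroup requires.

```python
from typing import Any, Dict


def build_start_session_request(
    workgroup: str,
    spark_properties: Dict[str, str],
    max_concurrent_dpus: int = 20,  # assumed value; adjust for your workload
) -> Dict[str, Any]:
    """Build StartSession parameters with SparkProperties in EngineConfiguration."""
    return {
        "WorkGroup": workgroup,
        "EngineConfiguration": {
            "MaxConcurrentDpus": max_concurrent_dpus,
            "SparkProperties": dict(spark_properties),
        },
    }


def start_session(request: Dict[str, Any]) -> str:
    """Send the request to Athena and return the new session ID."""
    import boto3  # imported here so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_session(**request)
    return response["SessionId"]


# Hypothetical workgroup name and .jar location.
request = build_start_session_request(
    "my-spark-workgroup",
    {"spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar"},
)
```

Calling start_session(request) requires AWS credentials and a Spark-enabled workgroup. Building the request separately makes it easy to inspect the configuration before a session is started.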
Troubleshooting session start errors
When a custom configuration error occurs during a session start, the Athena for Spark console shows an error message banner. To troubleshoot session start errors, you can check session state change or logging information.
Viewing session state change information
You can get details about a session state change from the Athena notebook editor or from the Athena API.
To view session state information in the Athena console
- In the Athena notebook editor, from the Session menu on the upper right, choose View details.
- View the Current session tab. The Session information section shows you information like session ID, workgroup, status, and state change reason.
The following screen capture example shows information in the State change reason section of the Session information dialog box for a Spark session error in Athena.
To view session state information using the Athena API
- In the Athena API, you can find session state change information in the StateChangeReason field of the SessionStatus object.
Note
After you manually stop a session, or if the session stops after an idle
timeout (the default is 20 minutes), the value of StateChangeReason changes to
Session was terminated per request.
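To read the state change reason programmatically, one option is the GetSessionStatus action. The following sketch uses the boto3 Athena client. The helper names and the sample response are illustrative assumptions, and the sample only mirrors the State and StateChangeReason fields of the Status structure.

```python
from typing import Any, Dict, Tuple


def summarize_session_status(response: Dict[str, Any]) -> Tuple[str, str]:
    """Return (State, StateChangeReason) from a GetSessionStatus response."""
    status = response.get("Status", {})
    return status.get("State", "UNKNOWN"), status.get("StateChangeReason", "")


def fetch_session_status(session_id: str) -> Tuple[str, str]:
    """Call Athena for the given session and summarize its status."""
    import boto3  # imported here so the summarizer works without boto3

    athena = boto3.client("athena")
    return summarize_session_status(athena.get_session_status(SessionId=session_id))


# Illustrative response shape with a hypothetical session ID.
sample_response = {
    "SessionId": "example-session-id",
    "Status": {
        "State": "TERMINATED",
        "StateChangeReason": "Session was terminated per request",
    },
}
state, reason = summarize_session_status(sample_response)
```

For a session that failed to start because of a configuration problem, the reason string is where the error detail would be expected to appear.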
Using logging to troubleshoot session start errors
Custom configuration errors that occur during a session start are logged by Amazon CloudWatch.
In your CloudWatch Logs, search for error messages from AthenaSparkSessionErrorLogger
to troubleshoot a failed session start.
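One way to search the logs for AthenaSparkSessionErrorLogger entries is the CloudWatch Logs FilterLogEvents action, sketched below with boto3. The helper names and the log group name are hypothetical. Use the log group that your workgroup is configured to write to.

```python
from typing import Any, Dict, List

LOGGER_NAME = "AthenaSparkSessionErrorLogger"


def build_error_log_query(log_group_name: str) -> Dict[str, Any]:
    """Build FilterLogEvents parameters that match the error logger name."""
    return {
        "logGroupName": log_group_name,
        # Double quotes make the filter pattern match the exact term.
        "filterPattern": f'"{LOGGER_NAME}"',
    }


def find_session_start_errors(log_group_name: str) -> List[str]:
    """Return the messages of all matching log events in the log group."""
    import boto3  # imported here so the query builder works without boto3

    logs = boto3.client("logs")
    paginator = logs.get_paginator("filter_log_events")
    messages: List[str] = []
    for page in paginator.paginate(**build_error_log_query(log_group_name)):
        messages.extend(event["message"] for event in page.get("events", []))
    return messages


# Hypothetical log group name for illustration.
query = build_error_log_query("/aws-athena/my-spark-workgroup")
```

Adding startTime and endTime values (in milliseconds since the epoch) to the query narrows the search to the window around the failed session start.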
For more information about Spark logging, see Logging Spark application events in Athena.
For more information about troubleshooting sessions in Athena for Spark, see Troubleshooting sessions.