Adding JAR files and custom Spark configuration - Amazon Athena

Adding JAR files and custom Spark configuration

When you create or edit a session in Amazon Athena for Apache Spark, you can use Spark properties to specify .jar files, packages, or other custom configuration for the session. To specify your Spark properties, you can use the Athena console, the Amazon CLI, or the Athena API.

Using the Athena console to specify Spark properties

In the Athena console, you can specify your Spark properties when you create a notebook or edit a current session.

To add properties in the Create notebook or Edit session details dialog box
  1. Expand Spark properties.

  2. To add your properties, use the Edit in table or Edit in JSON option.

    • For the Edit in table option, choose Add property to add a property, or choose Remove to remove a property. Use the Key and Value boxes to enter property names and their values.

      • To add a custom .jar file, use the spark.jars property.

      • To specify a package file, use the spark.jars.packages property.

    • To enter and edit your configuration directly, choose the Edit in JSON option. In the JSON text editor, you can perform the following tasks:

      • Choose Copy to copy the JSON text to the clipboard.

      • Choose Clear to remove all text from the JSON editor.

      • Choose the settings (gear) icon to configure line wrapping or choose a color theme for the JSON editor.
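As a rough illustration of what the Edit in JSON option might contain, the following Python sketch builds a property map and prints it as JSON. It assumes the editor accepts a flat JSON object of property names and string values. The S3 path and the package coordinates are hypothetical placeholders, not values from this guide.

```python
import json

# Hypothetical values for illustration only; substitute your own
# S3 location and package coordinates.
spark_properties = {
    # Custom .jar file
    "spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar",
    # Package file
    "spark.jars.packages": "com.example:my-library:1.0.0",
}

# Text that could be pasted into the JSON editor, assuming it accepts
# a flat object of key/value pairs.
json_text = json.dumps(spark_properties, indent=2)
print(json_text)
```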

Notes

  • Setting properties in Athena for Spark is equivalent to setting Spark properties directly on a SparkConf object.

  • Start all Spark properties with the spark. prefix. Properties with other prefixes are ignored.

  • Not all Spark properties are available for custom configuration on Athena. If you submit a StartSession request that has a restricted configuration, the session fails to start.

    • You cannot use the spark.athena. prefix because it is reserved.
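The prefix rules in these notes can be checked locally before you submit a configuration. The following is a minimal sketch; the function name classify_spark_properties and the sample keys are hypothetical. It covers only the two prefix rules stated above and cannot detect other restricted properties, because the full list of restrictions is not given here.

```python
def classify_spark_properties(properties):
    """Split properties into accepted, ignored, and reserved groups.

    Only the prefix rules from the notes are applied:
    - keys starting with "spark.athena." are reserved,
    - other keys starting with "spark." are candidates for submission,
    - everything else would be ignored.
    """
    accepted, ignored, reserved = {}, {}, {}
    for key, value in properties.items():
        if key.startswith("spark.athena."):
            reserved[key] = value
        elif key.startswith("spark."):
            accepted[key] = value
        else:
            ignored[key] = value
    return accepted, ignored, reserved


# Hypothetical input for illustration.
accepted, ignored, reserved = classify_spark_properties({
    "spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar",
    "my.custom.setting": "value",
    "spark.athena.example": "value",
})
print(sorted(accepted), sorted(ignored), sorted(reserved))
```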

Using the Amazon CLI or Athena API to provide custom configuration

To use the Amazon CLI or Athena API to provide your session configuration, use the StartSession API action or the start-session CLI command. In your StartSession request, use the SparkProperties field of the EngineConfiguration object to pass your configuration information in JSON format. This starts a session with your specified configuration. For request syntax, see StartSession in the Amazon Athena API Reference.
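The following Python sketch shows one way a StartSession request with SparkProperties could be assembled and sent through the boto3 Athena client. The helper names, the workgroup name, the DPU value, and the S3 path are assumptions made for illustration. The request is built separately from the call so that it can be inspected without AWS credentials.

```python
def build_start_session_request(workgroup, spark_properties, max_concurrent_dpus=20):
    """Build keyword arguments for a StartSession request.

    The workgroup must be a Spark-enabled workgroup. The default DPU
    value is an illustrative assumption, not a recommendation.
    """
    return {
        "WorkGroup": workgroup,
        "EngineConfiguration": {
            "MaxConcurrentDpus": max_concurrent_dpus,
            # Custom configuration is passed as a map of property
            # names to string values.
            "SparkProperties": dict(spark_properties),
        },
    }


def start_session(request):
    """Send the request and return the new session ID."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("athena")
    response = client.start_session(**request)
    return response["SessionId"]


# Hypothetical workgroup and property values.
request = build_start_session_request(
    "my-spark-workgroup",
    {"spark.jars": "s3://amzn-s3-demo-bucket/jars/my-udfs.jar"},
)
print(request)
```

Calling start_session(request) requires valid credentials and an existing Spark-enabled workgroup. If the request contains a restricted configuration, the session fails to start.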

Troubleshooting session start errors

When a custom configuration error occurs during a session start, the Athena for Spark console shows an error message banner. To troubleshoot session start errors, you can check session state change or logging information.

Viewing session state change information

You can get details about a session state change from the Athena notebook editor or from the Athena API.

To view session state information in the Athena console
  1. In the Athena notebook editor, from the Session menu on the upper right, choose View details.

  2. View the Current session tab. The Session information section shows you information like session ID, workgroup, status, and state change reason.

    The State change reason section of the Session information dialog box shows information about a Spark session error in Athena.

    [Screen capture: Viewing session state change information in the Athena for Spark console.]

To view session state information using the Athena API
  • In the Athena API, you can find session state change information in the StateChangeReason field of the SessionStatus object.

Note

After you manually stop a session, or if the session stops after an idle timeout (the default is 20 minutes), the value of StateChangeReason changes to Session was terminated per request.
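As a sketch of the API route, the following Python code reads the state and StateChangeReason values from a GetSessionStatus response obtained through boto3. The helper names and the sample response are hypothetical. The parsing function takes a plain dictionary so that it can be tried without calling AWS.

```python
def describe_state_change(response):
    """Return (state, reason) from a GetSessionStatus response.

    Missing fields fall back to placeholder values so that a partial
    response does not raise an exception.
    """
    status = response.get("Status", {})
    return status.get("State", "UNKNOWN"), status.get("StateChangeReason", "")


def fetch_session_status(session_id):
    """Call GetSessionStatus for the given session ID."""
    import boto3  # imported here so the parser above works without boto3

    return boto3.client("athena").get_session_status(SessionId=session_id)


# Made-up response for illustration; real responses carry more fields.
sample_response = {
    "SessionId": "example-session-id",
    "Status": {"State": "FAILED", "StateChangeReason": "example reason text"},
}
state, reason = describe_state_change(sample_response)
print(f"{state}: {reason}")
```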

Using logging to troubleshoot session start errors

Custom configuration errors that occur during a session start are logged to Amazon CloudWatch. In CloudWatch Logs, search for error messages from AthenaSparkSessionErrorLogger to troubleshoot a failed session start.
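One possible way to run that search from code is the CloudWatch Logs FilterLogEvents operation, sketched below with boto3. The log group name is a hypothetical placeholder because it depends on how logging is configured for your workgroup. The helper names are also assumptions, and the sketch reads only the first page of results.

```python
def build_error_log_query(log_group_name, limit=50):
    """Build keyword arguments for a FilterLogEvents request.

    The double-quoted filter pattern matches log events that contain
    the exact term AthenaSparkSessionErrorLogger.
    """
    return {
        "logGroupName": log_group_name,
        "filterPattern": '"AthenaSparkSessionErrorLogger"',
        "limit": limit,
    }


def fetch_error_messages(query):
    """Return the messages from the first page of matching events."""
    import boto3  # imported here so the builder above works without boto3

    logs = boto3.client("logs")
    response = logs.filter_log_events(**query)
    return [event["message"] for event in response.get("events", [])]


# Hypothetical log group name; use the one configured for your workgroup.
query = build_error_log_query("/example/athena-spark-log-group")
print(query)
```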

For more information about Spark logging, see Logging Spark application events in Athena.

For more information about troubleshooting sessions in Athena for Spark, see Troubleshooting sessions.