# Installation

You can add the SingleStore Spark Connector to your Spark application from `spark-shell`, `pyspark`, or `spark-submit` by passing its Maven coordinate with the `--packages` option. For example, with `spark-shell`:

```shell
$SPARK_HOME/bin/spark-shell --packages com.singlestore:singlestore-spark-connector_2.12:<insert-connector-version>-spark-<insert-spark-version>
```

Before running the command, replace `<insert-connector-version>` and `<insert-spark-version>` with your connector and Spark versions. For example:

```shell
$SPARK_HOME/bin/spark-shell --packages com.singlestore:singlestore-spark-connector_2.12:4.0.0-spark-3.2.0
```
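
The same `--packages` coordinate should also work with the other launchers mentioned above. The sketch below assembles the coordinate from two shell variables (`CONNECTOR_VERSION` and `SPARK_VERSION`, names chosen here for illustration) and shows, as commented lines, how it might be passed to `pyspark` or `spark-submit`. The main class and JAR path in the last line are placeholders, not values from this guide.

```shell
# Versions taken from the example above; substitute your own.
CONNECTOR_VERSION="4.0.0"
SPARK_VERSION="3.2.0"

# Maven coordinate of the connector build that matches your Spark version.
PACKAGE="com.singlestore:singlestore-spark-connector_2.12:${CONNECTOR_VERSION}-spark-${SPARK_VERSION}"

# Print the coordinate so you can check it before launching Spark.
echo "$PACKAGE"

# Pass the coordinate to whichever launcher you use (uncomment one):
# "$SPARK_HOME/bin/pyspark" --packages "$PACKAGE"
# "$SPARK_HOME/bin/spark-submit" --packages "$PACKAGE" --class <main-class> <path-to-your-app.jar>
```

Keeping the versions in variables makes it harder to mix a connector build with a Spark version it was not built for.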

You can also use Maven or SBT to integrate SingleStore with Spark.

## Integrate SingleStore with Spark Using Maven

To connect Spark to SingleStore using Maven:

1. Log in to the machine where you want to create the Maven project.

2. Create an empty Maven project (it contains only **pom.xml** and the **src** directory):
   ```shell
   mvn archetype:generate -DgroupId=example \
   -DartifactId=SparkSingleStoreConnection \
   -DarchetypeArtifactId=maven-archetype-quickstart \
   -DinteractiveMode=false
   ```
   **Note**: Maven uses a set of identifiers, also called coordinates, to uniquely identify a project and specify how the project artifact should be packaged. The command in this step sets the following:

   * `groupId` – a unique base name of the company or group that created the project
   * `artifactId` – a unique name of the project
   * `archetypeArtifactId` – the project template to generate from; `maven-archetype-quickstart` creates a minimal project with a **pom.xml** file and a **src** directory

3. Update the **pom.xml** file in your project to include the SingleStore Spark Connector dependency. Your **pom.xml** file may be different based on your project’s required dependencies and your version of Spark. Here's a sample **pom.xml** file:
   ```xml
   <?xml version="1.0" encoding="UTF-8"?>
   <project xmlns="http://maven.apache.org/POM/4.0.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
       <modelVersion>4.0.0</modelVersion>

       <groupId>org.example</groupId>
       <artifactId>SparkSingleStoreConnection</artifactId>
       <version>1.0-SNAPSHOT</version>
       <build>
           <plugins>
               <plugin>
                   <groupId>org.apache.maven.plugins</groupId>
                   <artifactId>maven-compiler-plugin</artifactId>
                   <version>3.8.0</version>
                   <configuration>
                       <source>1.8</source>
                       <target>1.8</target>
                   </configuration>
               </plugin>
               <plugin>
                   <artifactId>maven-shade-plugin</artifactId>
                   <version>2.4.1</version>
                   <executions>
                       <execution>
                           <phase>package</phase>
                           <goals>
                               <goal>shade</goal>
                           </goals>
                           <configuration>
                               <filters>
                                   <filter>
                                       <artifact>*:*</artifact>
                                       <excludes>
                                           <exclude>META-INF/*.RSA</exclude>
                                           <exclude>META-INF/*.SF</exclude>
                                           <exclude>META-INF/*.inf</exclude>
                                       </excludes>
                                   </filter>
                               </filters>
                               <transformers>
                                   <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                                   <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                       <resource>reference.conf</resource>
                                   </transformer>
                                   <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                       <mainClass>{main-class-name}</mainClass>
                                   </transformer>
                               </transformers>
                           </configuration>
                       </execution>
                   </executions>
               </plugin>
           </plugins>
       </build>

       <dependencies>
           <dependency>
               <groupId>org.apache.spark</groupId>
               <artifactId>spark-sql_2.12</artifactId>
               <version>{insert-spark-version}</version>
           </dependency>
           <dependency>
               <groupId>org.apache.spark</groupId>
               <artifactId>spark-core_2.12</artifactId>
               <version>{insert-spark-version}</version>
           </dependency>
           <dependency>
               <groupId>com.singlestore</groupId>
               <artifactId>singlestore-spark-connector_2.12</artifactId>
               <version>{insert-connector-version}-spark-{insert-spark-version}</version>
           </dependency>
           <dependency>
               <groupId>junit</groupId>
               <artifactId>junit</artifactId>
               <version>3.8.1</version>
               <scope>test</scope>
           </dependency>
       </dependencies>

   </project>
   ```

4. Update the **pom.xml** file with names appropriate to your application and environment:

   * Change the name of your parent folder.
   * Replace `{main-class-name}` in the `<mainClass>` tag with your target main class.
   * Replace `{insert-spark-version}` and `{insert-connector-version}` with the appropriate Spark and SingleStore connector versions, respectively.

5. Build the project from the parent directory using the following command:
   ```shell
   mvn clean package
   ```

You are ready to run the executable.
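
As a rough sketch of that last step: Maven normally writes the packaged JAR to the `target` directory as `<artifactId>-<version>.jar`, and with the shade configuration in the sample **pom.xml** that file should be the shaded JAR. The variable names below are illustrative, `MAIN_CLASS` is a hypothetical placeholder for your own entry point, and the `spark-submit` line is left commented so the snippet can be run without a Spark installation.

```shell
# Values assumed from the sample pom.xml; adjust them to match your project.
ARTIFACT_ID="SparkSingleStoreConnection"
VERSION="1.0-SNAPSHOT"
MAIN_CLASS="your.main.Class"   # hypothetical placeholder, not defined in this guide

# Default Maven output location: target/<artifactId>-<version>.jar
JAR="target/${ARTIFACT_ID}-${VERSION}.jar"
echo "$JAR"

# Submit the JAR to Spark (uncomment once the build has produced it):
# "$SPARK_HOME/bin/spark-submit" --class "$MAIN_CLASS" "$JAR"
```

If the sample's `<mainClass>` manifest entry is set, `spark-submit` may not need `--class` at all; passing it explicitly simply makes the entry point visible on the command line.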

## Integrate SingleStore with Spark Using SBT

To connect Spark to SingleStore using SBT:

1. Log in to the machine where you want to create the SBT project.

2. Create the following directory structure to encompass the SBT project:
   ```
   SparkSingleStoreSBT
     |── build.sbt
     |── project
       |── plugins.sbt
     |── src
       |── main
         |── scala
           |── Reader.scala
           |── Writer.scala
   ```

3. Add the following content to the **plugins.sbt** file, in addition to any other plugins required by your project:
   ```SBT
   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
   ```

4. Add the following content to the **build.sbt** file, along with any other dependencies your project requires. Your file may differ depending on your version of Spark and your other project dependencies. Here's a sample **build.sbt** file:
   ```SBT
   name := "SparkSingleStoreConnector"

   version := "0.1"

   scalaVersion := "2.12.12"

   mainClass := Some("Reader")

   val sparkVersion = "{spark-version}"

   libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion
   libraryDependencies += "com.singlestore" % "singlestore-spark-connector_2.12" % "{connector-version}-spark-{spark-version}"

   assemblyMergeStrategy in assembly := {
     case PathList("META-INF", xs @ _*) =>
       xs map {_.toLowerCase} match {
         case "manifest.mf" :: Nil | "index.list" :: Nil | "dependencies" :: Nil =>
           MergeStrategy.discard
         case ps @ x :: xs if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
           MergeStrategy.discard
         case "plexus" :: xs =>
           MergeStrategy.discard
         case "services" :: xs =>
           MergeStrategy.filterDistinctLines
         case "spring.schemas" :: Nil | "spring.handlers" :: Nil =>
           MergeStrategy.filterDistinctLines
         case _ => MergeStrategy.first
       }
     case "application.conf" => MergeStrategy.concat
     case "reference.conf" => MergeStrategy.concat
     case _ => MergeStrategy.first
   }
   ```
   Replace `{spark-version}` and `{connector-version}` with the appropriate Spark and SingleStore connector versions, respectively.

5. Develop your Spark application, using SingleStore as the data source and sink.

6. Package your application by setting the target main class in the **build.sbt** file:

   * Set the target main class in the `mainClass := Some("target_main_class_name")` setting.
   * Build the project from the parent directory using the following command:
     ```shell
     sbt clean assembly
     ```
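
For reference, the directory layout from step 2 can be created from a shell before you fill in the files. This is a minimal sketch that only creates the directories and the two empty build files; the Scala sources under `src/main/scala` are left for you to add.

```shell
# Create the project directories described in step 2.
mkdir -p SparkSingleStoreSBT/project SparkSingleStoreSBT/src/main/scala

# Create empty build files; their content comes from steps 3 and 4.
touch SparkSingleStoreSBT/build.sbt SparkSingleStoreSBT/project/plugins.sbt
```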

You are ready to run the executable.
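
A sketch of how the assembled JAR might then be submitted: the path below assumes sbt-assembly's default naming, `target/scala-<scala-binary-version>/<name>-assembly-<version>.jar`, which is an assumption worth checking against the path printed at the end of the `sbt clean assembly` output. The variable names are illustrative, and the `spark-submit` line is commented so the snippet can be run without a Spark installation.

```shell
# Values assumed from the sample build.sbt; adjust them to match your project.
NAME="SparkSingleStoreConnector"
VERSION="0.1"
SCALA_BINARY_VERSION="2.12"
MAIN_CLASS="Reader"

# Assumed default sbt-assembly output location (verify against the build log).
JAR="target/scala-${SCALA_BINARY_VERSION}/${NAME}-assembly-${VERSION}.jar"
echo "$JAR"

# Submit the JAR to Spark (uncomment once the build has produced it):
# "$SPARK_HOME/bin/spark-submit" --class "$MAIN_CLASS" "$JAR"
```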

***

Modified at: May 4, 2023

Source: [/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/installation/](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/installation/)

