Class AbstractBulkWriterContext

    • Field Detail

      • KRYO_REGISTRATION_WARNING

        public static final java.lang.String KRYO_REGISTRATION_WARNING
        This class implements the KryoSerializable interface purely as a detection device, to make sure SbwKryoRegistrator is properly in place.

        If this class is ever serialized by Kryo, the job is not set up correctly, so the implementation logs a warning and fails. The failure occurs early in the job and is explicit, so users can fix their code quickly and get up and running again, rather than hitting an unexplained NullPointerException further down the line.

        See Also:
        Constant Field Values
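
        The fail-fast idea described above can be sketched as follows. This is a minimal illustration, not the actual implementation: Kryo, Output, Input and KryoSerializable are local stand-ins that only mirror the shape of the com.esotericsoftware.kryo types so the example compiles on its own, and the message text, the GuardedContext class and the choice of IllegalStateException are all assumptions.

```java
// Sketch of the "fail fast when Kryo serializes this class" pattern.
final class KryoGuardSketch
{
    // Local stand-ins for the Kryo types (assumption: shape only, no behavior).
    static final class Kryo { }
    static final class Output { }
    static final class Input { }

    interface KryoSerializable
    {
        void write(Kryo kryo, Output output);
        void read(Kryo kryo, Input input);
    }

    // Hypothetical message text; the real constant value is not shown in this page.
    static final String KRYO_REGISTRATION_WARNING =
        "Context was serialized by Kryo; check that the Kryo registrator is configured";

    // Hypothetical context class: both hooks log and fail instead of serializing.
    static class GuardedContext implements KryoSerializable
    {
        @Override
        public void write(Kryo kryo, Output output)
        {
            fail();
        }

        @Override
        public void read(Kryo kryo, Input input)
        {
            fail();
        }

        private static void fail()
        {
            System.err.println(KRYO_REGISTRATION_WARNING);
            throw new IllegalStateException(KRYO_REGISTRATION_WARNING);
        }
    }

    // Returns true when the given call is rejected with the warning message.
    static boolean rejects(Runnable kryoCall)
    {
        try
        {
            kryoCall.run();
            return false;
        }
        catch (IllegalStateException expected)
        {
            return KRYO_REGISTRATION_WARNING.equals(expected.getMessage());
        }
    }
}
```

        With a guard like this, a misconfigured job stops at the first serialization attempt with a message that names the cause, instead of failing later on a half-initialized object.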
    • Constructor Detail

      • AbstractBulkWriterContext

        protected AbstractBulkWriterContext(@NotNull
                                            BulkSparkConf conf,
                                            @NotNull
                                            org.apache.spark.sql.types.StructType structType,
                                            @NotNull
                                            int sparkDefaultParallelism)
        Constructor for driver usage. Builds all components fresh on the driver.
        Parameters:
        conf - Bulk Spark configuration
        structType - DataFrame schema
        sparkDefaultParallelism - Spark default parallelism
      • AbstractBulkWriterContext

        protected AbstractBulkWriterContext(@NotNull
                                            BulkWriterConfig config)
        Constructor for executor usage. Reconstructs components from broadcast configuration on executors. This is used by the factory method BulkWriterContext.from(BulkWriterConfig).
        Parameters:
        config - immutable configuration for the bulk writer with pre-computed values
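
        The split between the two constructors can be pictured with the sketch below. It only illustrates the general pattern (compute once on the driver, snapshot into an immutable config, rebuild from the snapshot on executors). The Config and Context classes, the toConfig() method and the hard-coded version string are hypothetical stand-ins, not the real BulkWriterConfig or AbstractBulkWriterContext.

```java
// Sketch of a driver constructor and an executor constructor sharing one config.
final class ContextConstructionSketch
{
    // Hypothetical immutable snapshot of pre-computed values.
    static final class Config
    {
        final int sparkDefaultParallelism;
        final String lowestCassandraVersion;

        Config(int sparkDefaultParallelism, String lowestCassandraVersion)
        {
            this.sparkDefaultParallelism = sparkDefaultParallelism;
            this.lowestCassandraVersion = lowestCassandraVersion;
        }
    }

    static final class Context
    {
        final int sparkDefaultParallelism;
        final String lowestCassandraVersion;
        final boolean builtOnDriver;

        // Driver path: every component is computed fresh.
        Context(int sparkDefaultParallelism)
        {
            this.sparkDefaultParallelism = sparkDefaultParallelism;
            this.lowestCassandraVersion = discoverLowestCassandraVersion();
            this.builtOnDriver = true;
        }

        // Executor path: reuse the values carried by the broadcast snapshot.
        Context(Config config)
        {
            this.sparkDefaultParallelism = config.sparkDefaultParallelism;
            this.lowestCassandraVersion = config.lowestCassandraVersion;
            this.builtOnDriver = false;
        }

        // Captures the pre-computed values so they can be shipped to executors.
        Config toConfig()
        {
            return new Config(sparkDefaultParallelism, lowestCassandraVersion);
        }

        // Placeholder for a lookup that would only be performed on the driver.
        private static String discoverLowestCassandraVersion()
        {
            return "4.0.0";
        }
    }
}
```

        In this sketch the executor never repeats the driver-side lookup; it trusts the values carried by the snapshot.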
    • Method Detail

      • sparkDefaultParallelism

        protected final int sparkDefaultParallelism()
      • lowestCassandraVersion

        protected java.lang.String lowestCassandraVersion()
      • buildClusterInfo

        protected abstract ClusterInfo buildClusterInfo()
      • reconstructClusterInfoOnExecutor

        protected ClusterInfo reconstructClusterInfoOnExecutor(IBroadcastableClusterInfo clusterInfo)
        Reconstructs ClusterInfo on executors from broadcastable versions. This method is only called on executors when reconstructing BulkWriterContext from broadcast BulkWriterConfig. Each broadcastable type knows how to reconstruct itself into the appropriate full ClusterInfo implementation.
        Parameters:
        clusterInfo - the IBroadcastableClusterInfo received from the broadcast
        Returns:
        reconstructed ClusterInfo (CassandraClusterInfo or CassandraClusterInfoGroup)
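
        The statement that each broadcastable type knows how to reconstruct itself suggests plain polymorphic dispatch. The sketch below shows that idea under stated assumptions: every type in it, and the reconstruct() method name, is a hypothetical stand-in rather than the real IBroadcastableClusterInfo API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: each broadcastable variant rebuilds its own full ClusterInfo.
final class ReconstructSketch
{
    // Stand-in for the full, executor-side cluster information.
    interface ClusterInfo
    {
        String describe();
    }

    // Stand-in for the lightweight form that travels in the broadcast.
    interface BroadcastableClusterInfo
    {
        ClusterInfo reconstruct();
    }

    static final class SingleClusterInfo implements ClusterInfo
    {
        private final String clusterId;

        SingleClusterInfo(String clusterId)
        {
            this.clusterId = clusterId;
        }

        @Override
        public String describe()
        {
            return "cluster " + clusterId;
        }
    }

    static final class ClusterGroupInfo implements ClusterInfo
    {
        private final List<ClusterInfo> members;

        ClusterGroupInfo(List<ClusterInfo> members)
        {
            this.members = members;
        }

        @Override
        public String describe()
        {
            return "group of " + members.size() + " clusters";
        }
    }

    static final class BroadcastableSingle implements BroadcastableClusterInfo
    {
        private final String clusterId;

        BroadcastableSingle(String clusterId)
        {
            this.clusterId = clusterId;
        }

        @Override
        public ClusterInfo reconstruct()
        {
            return new SingleClusterInfo(clusterId);
        }
    }

    static final class BroadcastableGroup implements BroadcastableClusterInfo
    {
        private final List<BroadcastableSingle> members;

        BroadcastableGroup(List<BroadcastableSingle> members)
        {
            this.members = members;
        }

        @Override
        public ClusterInfo reconstruct()
        {
            // Rebuild every member, then wrap them in the group implementation.
            List<ClusterInfo> rebuilt = new ArrayList<>();
            for (BroadcastableSingle member : members)
            {
                rebuilt.add(member.reconstruct());
            }
            return new ClusterGroupInfo(rebuilt);
        }
    }

    // The context delegates; it needs no knowledge of the concrete variant.
    static ClusterInfo reconstructOnExecutor(BroadcastableClusterInfo clusterInfo)
    {
        return clusterInfo.reconstruct();
    }
}
```

        Keeping the reconstruction logic in the broadcastable types means the context method needs no type checks when a new variant is added.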
      • reconstructJobInfoOnExecutor

        protected JobInfo reconstructJobInfoOnExecutor(BroadcastableJobInfo jobInfo)
        Reconstructs JobInfo on executors from BroadcastableJobInfo. This method is only called on executors when reconstructing BulkWriterContext from broadcast BulkWriterConfig. It rebuilds CassandraJobInfo with TokenPartitioner reconstructed from the broadcastable partition mappings.
        Parameters:
        jobInfo - the BroadcastableJobInfo from broadcast
        Returns:
        reconstructed CassandraJobInfo
      • reconstructSchemaInfoOnExecutor

        protected SchemaInfo reconstructSchemaInfoOnExecutor(BroadcastableSchemaInfo schemaInfo)
        Reconstructs SchemaInfo on executors from BroadcastableSchemaInfo. This method is only called on executors when reconstructing BulkWriterContext from broadcast BulkWriterConfig. It reconstructs CassandraSchemaInfo and TableSchema from the broadcast data (no Sidecar calls needed).
        Parameters:
        schemaInfo - the BroadcastableSchemaInfo from broadcast
        Returns:
        reconstructed CassandraSchemaInfo
      • validateKeyspaceReplication

        protected abstract void validateKeyspaceReplication()
      • buildJobInfo

        protected JobInfo buildJobInfo()
      • generateRestoreJobIds

        protected abstract MultiClusterContainer<java.util.UUID> generateRestoreJobIds()
        Generates the restore job IDs used in the receiving Cassandra Sidecar clusters. In coordinated write mode, there should be a unique UUID per cluster; in single-cluster write mode, the MultiClusterContainer contains a single entry.
        Returns:
        restore job IDs that are unique per cluster
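
        The per-cluster contract can be illustrated with the sketch below. It makes several assumptions: a plain Map keyed by a cluster ID string stands in for MultiClusterContainer, the cluster IDs are invented for the example, and UUID.randomUUID() is used only for simplicity (a concrete subclass may generate its IDs differently).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Sketch: one restore job ID per receiving cluster.
final class RestoreJobIdSketch
{
    // Assumes the cluster IDs are distinct; returns one fresh UUID per cluster ID.
    static Map<String, UUID> generateRestoreJobIds(List<String> clusterIds)
    {
        Map<String, UUID> restoreJobIds = new LinkedHashMap<>();
        for (String clusterId : clusterIds)
        {
            restoreJobIds.put(clusterId, UUID.randomUUID());
        }
        return restoreJobIds;
    }
}
```

        Passing a single cluster ID models the single-cluster write mode, where the result holds exactly one entry; passing several models the coordinated write mode.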
      • buildCassandraBridge

        protected org.apache.cassandra.bridge.CassandraBridge buildCassandraBridge()
      • buildTransportContext

        protected TransportContext buildTransportContext(boolean isOnDriver)
      • buildJobStatsPublisher

        protected org.apache.cassandra.spark.common.stats.JobStatsPublisher buildJobStatsPublisher()
      • findLowestCassandraVersion

        protected java.lang.String findLowestCassandraVersion()
      • buildSchemaInfo

        protected SchemaInfo buildSchemaInfo(org.apache.spark.sql.types.StructType structType)
      • bridge

        public org.apache.cassandra.bridge.CassandraBridge bridge()
        Specified by:
        bridge in interface BulkWriterContext
      • jobStats

        public org.apache.cassandra.spark.common.stats.JobStatsPublisher jobStats()
        Specified by:
        jobStats in interface BulkWriterContext
      • initializeTableSchema

        @NotNull
        protected TableSchema initializeTableSchema(@NotNull
                                                    BulkSparkConf conf,
                                                    @NotNull
                                                    org.apache.spark.sql.types.StructType dfSchema,
                                                    TableInfoProvider tableInfoProvider,
                                                    java.lang.String lowestCassandraVersion)
      • createTransportContext

        @NotNull
        protected TransportContext createTransportContext(boolean isOnDriver)
      • write

        public void write(com.esotericsoftware.kryo.Kryo kryo,
                          com.esotericsoftware.kryo.io.Output output)
        Specified by:
        write in interface com.esotericsoftware.kryo.KryoSerializable
      • read

        public void read(com.esotericsoftware.kryo.Kryo kryo,
                         com.esotericsoftware.kryo.io.Input input)
        Specified by:
        read in interface com.esotericsoftware.kryo.KryoSerializable