Write to Delta
SparkDeltaDestination
Bases: DestinationInterface
The Spark Delta Destination is used to write data to a Delta table.
Parameters:
Name | Type | Description | Default
---|---|---|---
data | DataFrame | Dataframe to be written to Delta | required
table_name | str | Name of the Hive Metastore or Unity Catalog Delta Table | required
options | dict | Options that can be specified for a Delta Table write operation (see Attributes table below). Further information on the options is available for batch and streaming. | required
mode | str | Method of writing to Delta Table - append/overwrite (batch), append/complete (stream) | 'append'
trigger | str | Frequency of the write operation | '10 seconds'
query_name | str | Unique name for the query in the associated SparkSession | 'DeltaDestination'
Attributes:
Name | Type | Description
---|---|---
checkpointLocation | str | Path to checkpoint files. (Streaming)
txnAppId | str | A unique string that you can pass on each DataFrame write. (Batch & Streaming)
txnVersion | str | A monotonically increasing number that acts as a transaction version. (Batch & Streaming)
maxRecordsPerFile | int or str | Specify the maximum number of records to write to a single file for a Delta Lake table. (Batch)
replaceWhere | str | Condition(s) for overwriting. (Batch)
partitionOverwriteMode | str | When set to dynamic, overwrites all existing data in each logical partition for which the write will commit new data. Default is static. (Batch)
overwriteSchema | bool or str | If True, overwrites the schema as well as the table data. (Batch)
Source code in src/sdk/python/rtdip_sdk/pipelines/destinations/spark/delta.py
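A minimal construction sketch follows, assuming the class can be imported from the module implied by the source path above (rtdip_sdk.pipelines.destinations.spark.delta); the table name and option values are illustrative only.

```python
from pyspark.sql import SparkSession

# Import path assumed from the source file location shown above
from rtdip_sdk.pipelines.destinations.spark.delta import SparkDeltaDestination

spark = SparkSession.builder.getOrCreate()

# Hypothetical batch DataFrame to write
df = spark.createDataFrame([("Sensor_1", 23.5)], ["TagName", "Value"])

delta_destination = SparkDeltaDestination(
    data=df,
    table_name="catalog.schema.example_table",  # hypothetical Unity Catalog table
    options={
        "txnAppId": "example_pipeline",  # idempotency key (Batch & Streaming)
        "txnVersion": "1",               # monotonically increasing transaction version
    },
    mode="append",
    trigger="10 seconds",
    query_name="DeltaDestination",
)

# Call write_batch() or write_stream() depending on whether `data`
# is a batch or a streaming DataFrame.
```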
system_type()
staticmethod
Attributes:
Name | Type | Description
---|---|---
SystemType | Environment | Requires PYSPARK
Source code in src/sdk/python/rtdip_sdk/pipelines/destinations/spark/delta.py
write_batch()
Writes batch data to Delta. Most of the options provided by the Apache Spark DataFrame write API are supported for performing batch writes on tables.
Source code in src/sdk/python/rtdip_sdk/pipelines/destinations/spark/delta.py
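A hedged batch-write sketch using the batch-only options from the Attributes table; the import path is assumed from the source location above, and the table name and predicate are illustrative.

```python
from pyspark.sql import SparkSession

from rtdip_sdk.pipelines.destinations.spark.delta import SparkDeltaDestination  # assumed import path

spark = SparkSession.builder.getOrCreate()

# Hypothetical replacement data for a single tag
df = spark.createDataFrame([("Sensor_1", 24.1)], ["TagName", "Value"])

SparkDeltaDestination(
    data=df,
    table_name="example_table",                  # hypothetical table name
    options={
        "replaceWhere": "TagName = 'Sensor_1'",  # only rows matching the condition are overwritten
        "maxRecordsPerFile": "10000",            # cap records written per output file
    },
    mode="overwrite",
).write_batch()
```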
write_stream()
Writes streaming data to Delta. Exactly-once processing is guaranteed.
Source code in src/sdk/python/rtdip_sdk/pipelines/destinations/spark/delta.py
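A hedged streaming sketch using Spark's built-in rate source; checkpointLocation is the streaming option from the Attributes table, and the import path, table name, and checkpoint path are illustrative assumptions.

```python
from pyspark.sql import SparkSession

from rtdip_sdk.pipelines.destinations.spark.delta import SparkDeltaDestination  # assumed import path

spark = SparkSession.builder.getOrCreate()

# Streaming DataFrame from Spark's built-in rate source (timestamp, value columns)
stream_df = spark.readStream.format("rate").load()

SparkDeltaDestination(
    data=stream_df,
    table_name="example_streaming_table",  # hypothetical table name
    options={
        "checkpointLocation": "/tmp/checkpoints/example_streaming_table"  # checkpoint path for the streaming query
    },
    mode="append",
    trigger="10 seconds",            # micro-batch frequency
    query_name="ExampleDeltaWrite",  # unique query name in the SparkSession
).write_stream()
```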